Abstract. Though algebraic geometry over C is often used to describe the closure of the tensors 
of a given size and complex rank, this variety includes tensors of both smaller and larger rank. Here 
we focus on the n x n x n tensors of rank n over C, which have as a dense subset the orbit of a 
single tensor under a natural group action. We construct polynomial invariants under this group 
action whose non-vanishing distinguishes this orbit from points only in its closure. Together with an 
explicit subset of the defining polynomials of the variety, this gives a semialgebraic description of the 
tensors of rank n and multilinear rank (n, n, n). The polynomials we construct coincide with Cayley's 
hyperdeterminant in the case n = 2, and thus generalize it. Though our construction is direct and 
explicit, we also recast our functions in the language of representation theory for additional insights. 

We give three applications in different directions: First, we develop basic topological under- 
standing of how the real tensors of complex rank n and multilinear rank (n, n, n) form a collection of 
path-connected subsets, one of which contains tensors of real rank n. Second, we use the invariants 
to develop a semialgebraic description of the set of probability distributions that can arise from a 
simple stochastic model with a hidden variable, a model that is important in phylogenetics and other 
fields. Third, we construct simple examples of tensors of rank 2n - 1 which lie in the closure of those 
of rank n. 
1. Introduction. The notion of tensor rank naturally extends the familiar no- 
tion of matrix rank for two-dimensional numerical arrays to d-dimensional arrays, 
and likewise has extensive connections to applied problems. However, basic questions 
about tensor rank can be much more difficult to answer than their matrix analogs, and 
many open problems remain. Several natural problems concerning tensor rank are to 
determine for a given field the rank of an explicitly given tensor, to determine for a 
given field and format the possible ranks of all tensors, and to determine for a given 
field and format the generic rank(s) of a tensor. While the matrix versions of these 
problems are solved by an understanding of Gaussian elimination and determinants, 
for higher dimensional tensors they have so far eluded general solutions. 

The case of 2 x 2 x 2 tensors, however, is quite well studied [9], and provides 
one model of desirable understanding: Over C or R, such a tensor may have rank 0, 
1, 2, or 3 only. Over C descriptions of the sets of tensors of each of these possible 
ranks may be given, in terms of intersections, unions, and complements of explicit 
algebraic varieties. These descriptions can be thus phrased as boolean combinations 
of polynomial equalities. Over R, analogous explicit descriptions require polynomial 
inequalities using '>' as well, and the descriptions are thus semialgebraic. For larger 
tensors of any given rank the existence of a semialgebraic description is a consequence 
of the Tarski-Seidenberg Theorem [3TJE7], but complete explicit descriptions are not 
known. 
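A polynomial of particular importance in understanding the 2 x 2 x 2 case is 
Cayley's hyperdeterminant [7], 

$$
\begin{aligned}
A(P) ={}& \left(p_{000}^2 p_{111}^2 + p_{001}^2 p_{110}^2 + p_{010}^2 p_{101}^2 + p_{011}^2 p_{100}^2\right) \\
& - 2\left(p_{000}p_{001}p_{110}p_{111} + p_{000}p_{010}p_{101}p_{111} + p_{000}p_{011}p_{100}p_{111}\right. \\
& \qquad \left. {} + p_{001}p_{010}p_{101}p_{110} + p_{001}p_{011}p_{110}p_{100} + p_{010}p_{011}p_{101}p_{100}\right) \\
& + 4\left(p_{000}p_{011}p_{101}p_{110} + p_{001}p_{010}p_{100}p_{111}\right),
\end{aligned}
\tag{1.1}
$$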

also known as the tangle in the physics literature [5]. The function A has non-zero 
values precisely on a certain dense subset of those 2 x 2 x 2 tensors of complex tensor 
rank 2. If a tensor is real, the sign of A gives further information about its 
tensor rank over R: If A > 0, then the tensor has real tensor rank 2, and if A < 0 its 
real tensor rank is 3. 
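
As a small numerical illustration of the sign criterion, the following sketch evaluates the polynomial (1.1) term by term. The function name `hyperdeterminant` and the two test tensors are illustrative choices, not taken from the paper: `D` is the diagonal tensor with ones at positions (0,0,0) and (1,1,1), and `Q` is a real tensor whose two 3-slices are the identity and a rotation by a quarter turn.

```python
def hyperdeterminant(p):
    """Evaluate Cayley's hyperdeterminant (1.1) on a 2x2x2 tensor p[i][j][k]."""
    q = lambda i, j, k: p[i][j][k]
    squares = (q(0, 0, 0)**2 * q(1, 1, 1)**2 + q(0, 0, 1)**2 * q(1, 1, 0)**2
               + q(0, 1, 0)**2 * q(1, 0, 1)**2 + q(0, 1, 1)**2 * q(1, 0, 0)**2)
    pairs = (q(0, 0, 0) * q(0, 0, 1) * q(1, 1, 0) * q(1, 1, 1)
             + q(0, 0, 0) * q(0, 1, 0) * q(1, 0, 1) * q(1, 1, 1)
             + q(0, 0, 0) * q(0, 1, 1) * q(1, 0, 0) * q(1, 1, 1)
             + q(0, 0, 1) * q(0, 1, 0) * q(1, 0, 1) * q(1, 1, 0)
             + q(0, 0, 1) * q(0, 1, 1) * q(1, 1, 0) * q(1, 0, 0)
             + q(0, 1, 0) * q(0, 1, 1) * q(1, 0, 1) * q(1, 0, 0))
    quads = (q(0, 0, 0) * q(0, 1, 1) * q(1, 0, 1) * q(1, 1, 0)
             + q(0, 0, 1) * q(0, 1, 0) * q(1, 0, 0) * q(1, 1, 1))
    return squares - 2 * pairs + 4 * quads


# Illustrative tensors (assumed examples, indexed as p[i][j][k]).
D = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]    # e_0 (x) e_0 (x) e_0 + e_1 (x) e_1 (x) e_1
Q = [[[1, 0], [0, 1]], [[0, -1], [1, 0]]]   # 3-slices: identity and [[0, 1], [-1, 0]]

delta_D = hyperdeterminant(D)   # positive: real rank 2 by the criterion above
delta_Q = hyperdeterminant(Q)   # negative: real rank 3 by the criterion above
```

For `Q` the value agrees with the discriminant of det(x P_1 + y P_2) = x^2 + y^2, which is negative because this quadratic form has no real roots.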

The role of A here can be partially understood as a consequence of it being an 
invariant of the group GL(2, C) x GL(2, C) x GL(2, C), which acts on 2 x 2 x 2 complex 
tensors in the three indices, and preserves their tensor rank. The transformation 
property of A under this group, along with explicit evaluation at a particular tensor 
of rank 2, implies it is non-zero on an orbit which is dense among all tensors of rank 
2. The fact that it is zero off of this orbit can be shown by first determining a list 
of canonical representatives of other orbits, and then explicitly evaluating A on them 
to see that it vanishes. Thus both the transformation property of A under the group 
and the ability to evaluate A at specific points are essential. 

In this work we focus on n x n x n tensors of tensor rank n, over C and over R, 
with the goal of generalizing our detailed understanding of the 2 x 2 x 2 tensors to 
this particular case. Although we do not translate our results here to the cases of 
$n_1 \times n_2 \times n_3$ tensors of rank n with $n \le n_i$, this should be possible by applying maps 
$C^{n_i} \to C^n$. Thus what is most important about this case is that the dimensions 
of the tensor are sufficiently large that they do not put constraints on studying the 
given rank n. (A more careful reading will show that for many arguments we only 
need $n_i \ge n$ for at least 2 values of i.) 

Over C, the rank-2 tensors are dense among all 2 x 2 x 2 tensors. However, for 
n > 2, the closure of the rank-n tensors in the n x n x n ones forms an algebraic variety 
of dimension strictly less than $n^3$. From the closure operation, this variety contains 
all tensors of rank < n, as well as some of rank > n. Much previous work has focused 
on determining defining polynomials of this variety, that is, polynomials that vanish 
on all such rank n tensors. For n = 3, the ideal of polynomials defining this variety is 
known [13]. For n = 4 a set-theoretic defining set of polynomials has been determined 
[lOl l^lHTj. For all $n \ge 3$, many polynomials in the ideal are known, through a general 
construction of 'commutation relations' [2], [3]. Moreover, the commutation relations 
give the full ideal up to an explicit saturation, and taking a radical. Nonetheless, the 
full ideal is still not understood if $n \ge 4$. 

In this article we turn from studying polynomial equalities related to tensor rank 
issues, to inequalities. Our main contribution is a generalization to arbitrary n of 
the n = 2 hyperdeterminant, A, of Cayley. We obtain polynomial functions whose 
nonvanishing singles out a dense orbit of the tensors of rank n from their closure. We 
emphasize that this generalization does not lead to those functions standardly called 
'hyperdeterminants' in the modern mathematics literature [14], but rather to a set 
of functions that generalize the properties of A in another way, appropriate to the 
problem at hand. Just as A defines a one-dimensional representation of GL(2, C) x 
GL(2, C) x GL(2, C), our functions determine a multidimensional representation of 
GL(n,C) x GL(n,C) x GL(n,C). 

The usefulness of studying invariant spaces of polynomials for investigating tensor 
rank issues is, of course, not new (see, for instance, the survey [HJ, and its references, 
for instances going beyond A). Because the relevant representation theory is so highly 
developed, such study can be fairly abstract. While the gap between abstractly un- 
derstanding such polynomials and concretely evaluating them is conceptually a small 
one, in practice it is by no means trivially bridged ([6] gives an excellent illustration 
of this). Since our arguments depend crucially on being able to explicitly evaluate 
our invariant polynomials on certain tensors, a concrete approach to developing them 
is warranted here. 

Specifically, we investigate the transformation properties of our functions under 
a group action, a reduction under that action of most tensors to a semi-canonical 
form which is made possible by knowledge of the commutation relations, and the 
evaluation of our functions at these semi-canonical forms. Together, these allow us to 
make precise statements about the zero set of these polynomials within the closure of 
the rank-n n x n x n tensors that are analogous to statements about the zero set of 
A in the 2 x 2 x 2 case. After this initial development, we reframe our work in the 
language of representation theory. Finally we show how our generalization of A can 
be used for three different applications. 

This paper is organized as follows: After definitions and preliminaries in §2, in 
§3 we recall relevant facts about the algebraic variety of n x n x n tensors of rank 
n, and construct semi-canonical orbit representatives under the group action. We 
then construct our invariant polynomial functions in §4 and determine on which 
complex tensors of complex border rank at most n they vanish. Then in §5 we study 
these functions in the framework of representation theory for the group GL(n,C) x 
GL(n,C) x GL(n,C). 

As a first application, in §6 we use these polynomials to investigate real tensors 
of complex rank n. We show that the zero set of our invariants divides the real points 
on the variety into several path components, each of which contains tensors of a single 
signature. These signatures are characterized by the number of complex conjugate 
pairs of rank-1 tensors in their unique rank-n decomposition. We also determine the 
number of path components of each signature, and find that the tensors of real rank 
n form a single component. We extend, from n = 2 to n = 3, the result that the 
sign of a polynomial invariant distinguishes whether a tensor of complex rank n also 
has real rank n. For larger n the sign of our invariant is insufficient to single out the 
component composed of tensors of real rank n, but why it fails to do so is made clear. 

In §7 we turn to the application which originally motivated our interest in n x n x n 
tensors of rank n, which is their appearance as certain statistical models, in both latent 
class analysis and phylogenetics. For these applications (which we introduce more 
thoroughly in §7), such a tensor represents a joint probability distribution of three 
observed random variables, each with discrete state space of size n, and thus has non- 
negative entries summing to 1. Its decomposition into a sum of rank-1 tensors reflects 
the structure of the stochastic model, in which the distributions of each of the observed 
variables depend on the state of a common hidden (latent, or unobservable) variable 
with n states. The phylogenetic application can be seen through an interpretation 
of the observed variables as having 4 states, the bases A, C, G, T that may appear at 
a particular site in DNA sequences from 3 species, while the state of the hidden 
variable represents the base in an ancestral organism from which the others evolved. 
For these statistical applications, the rank-1 components of these tensors are 
themselves probability distributions, up to scaling, so it is important to not only 
determine the rank of a tensor, but also to be able to determine if the rank-1 components 
have non-negative entries. The 2 x 2 x 2 rank-2 case and its extension to phylogenetic 
trees have recently been studied in [33] (see also [17], [25]). In that work Cayley's 
hyperdeterminant A played an important role. Our work here began as a step toward 
extending some of the results of [33] from n = 2 to n > 2. In this paper, however, we 
limit ourselves to the simplest phylogenetic model (on a 3-leaf tree), as the extension 
to larger trees depends on other ideas which will be presented in [J]. Despite lacking a 
good test for determining that a tensor of complex rank n has real rank n, we borrow 
ideas from [3] to give semialgebraic conditions that ensure a tensor is a probability 
distribution arising from the latent class model (with certain mild conditions on the 
parameters). 

As a final application of the main theorems of this paper, in §8 we show examples 
of n x n x n tensors of border rank n, but rank larger than n. We give a simple, 
explicit example of a tensor of the sort with rank 2n - 1. When n = 2 this gives the 
well-known canonical form of a complex rank 3 tensor; however, for general n > 2 it 
appears to be a new class of examples of 'rank jumping' by a large amount. 

2. Three-dimensional tensors, group actions, and rank. Denote the space 
of all complex tensors of format $(n_1, n_2, n_3)$ by $S = S(n_1, n_2, n_3) = C^{n_1} \otimes C^{n_2} \otimes C^{n_3}$. 
Note $S \cong C^{n_1 n_2 n_3}$, but that one may view an element of S concretely as an $n_1 \times n_2 \times n_3$ 
array of complex numbers. We thus identify such tensors with three-dimensional 
hypermatrices. 

Let 

$$ G(C) = GL(n_1, C) \times GL(n_2, C) \times GL(n_3, C), $$

and 

$$ G(R) = GL(n_1, R) \times GL(n_2, R) \times GL(n_3, R) \subset G(C), $$

which act on S through the 3 indices of tensors. We write this action on the right, 
using several interchangeable notations, so that for $P \in S$, $(g_1, g_2, g_3) \in G(C)$, 

$$ P(g_1, g_2, g_3) = ((P *_1 g_1) *_2 g_2) *_3 g_3 = \cdots = ((P *_3 g_3) *_2 g_2) *_1 g_1, $$

where, for instance, 

$$ (P *_3 g_3)_{ijk} = \sum_{l=1}^{n_3} P_{ijl}\, g_3(l, k), $$

with similar formulas for the action in other indices. This notation is also useful for 
multiplication by vectors, so that if $v \in C^{n_3}$, for instance, then $P *_3 v$ is a matrix 
with entries 

$$ (P *_3 v)_{ij} = \sum_{k=1}^{n_3} P_{ijk}\, v_k. $$

For $i \in \{1,2,3\}$ by the i-slices of P we mean the $n_i$ matrices $P *_i e_j$, whose 
entries have the i-th index fixed as j. For example for i = 3, a tensor P has matrix 
slices $P_1, P_2, \dots, P_{n_3}$, where $P_j = P(\cdot,\cdot,j)$. The action of G(C) on a tensor can 
be understood through transformation of these slices, as $P' = P(g_1, g_2, I)$ has slices 
$P'_j = g_1^T P_j g_2$, while $P'' = P(I, I, g_3)$ has slices $P''_j = \sum_k P_k\, g_3(k,j)$. 
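
The slice formulas can be checked directly on a small example. The sketch below implements the right action on one index with nested Python lists (indices start at 0, so index 3 of the text is `mode=2`); the helper names `act`, `slice3`, `matmul`, `transpose` and the particular integer tensor and matrices are illustrative assumptions, not notation from the paper.

```python
def act(P, g, mode):
    """Right action of the n x n matrix g on index `mode` (0, 1 or 2) of P.

    For mode 2 this computes (P *_3 g)_{ijk} = sum_l P_{ijl} g(l, k).
    """
    n = len(P)
    R = range(n)

    def entry(i, j, k):
        idx = (i, j, k)
        total = 0
        for l in R:
            src = list(idx)
            src[mode] = l  # the summed index replaces the acted-on index
            total += P[src[0]][src[1]][src[2]] * g[l][idx[mode]]
        return total

    return [[[entry(i, j, k) for k in R] for j in R] for i in R]


def slice3(P, k):
    """The k-th 3-slice P(., ., k) as a matrix."""
    n = len(P)
    return [[P[i][j][k] for j in range(n)] for i in range(n)]


def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


# Illustrative data (assumed): a 2 x 2 x 2 integer tensor and two invertible matrices.
P = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
g1 = [[1, 2], [0, 1]]
g2 = [[1, 0], [3, 1]]

# P' = P(g1, g2, I): each 3-slice should transform as g1^T P_k g2.
P_prime = act(act(P, g1, 0), g2, 1)
slices_match = all(
    slice3(P_prime, k) == matmul(matmul(transpose(g1), slice3(P, k)), g2)
    for k in range(2)
)

# The actions in different indices commute, as the chain of equalities asserts.
order_irrelevant = act(act(P, g1, 0), g2, 1) == act(act(P, g2, 1), g1, 0)
```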

The complex tensor rank of $P \in S$ is the smallest integer r such that 

$$ P = \sum_{i=1}^{r} u_i \otimes v_i \otimes w_i, $$

for some $u_i \in C^{n_1}$, $v_i \in C^{n_2}$, and $w_i \in C^{n_3}$. For a real tensor, the real tensor rank is 
defined analogously, requiring that $u_i, v_i, w_i$ be real. Note that if a tensor is real, its 
real and complex tensor ranks need not be equal, though the complex tensor rank is 
an obvious lower bound for the real tensor rank. 

There is also a notion of multilinear rank of such a tensor, which is an ordered 
triple $(r_1, r_2, r_3)$. Here, for $\{i,j,k\} = \{1,2,3\}$, $r_i$ is the rank of the transformation 

$$ C^{n_i} \to C^{n_j} \otimes C^{n_k}, \qquad v \mapsto P *_i v, $$

associated to P, and thus is the ordinary matrix rank of the $n_j n_k \times n_i$ flattening of 
P. The multilinear rank of a real tensor is thus independent of the choice of field R 
or C, as the analogous fact holds for matrices. 

Both tensor rank (which we often will refer to as simply rank, or C-rank or R-rank 
if the field must be made clear) and multilinear rank are invariant under the action 
of the general linear groups. More precisely, if $P' = Pg$ with $g \in G(C)$, then P' and 
P have the same C-rank and multilinear rank. If P is real and $g \in G(R)$, then P and 
P' have the same R-rank as well. 

For the remainder of the paper, we restrict our attention to the case 

$$ n_1 = n_2 = n_3 = n, $$

though, as pointed out in the introduction, many results are easily modified to cases 
with $n_i \ge n$. 

If $v \in C^n$, let diag(v) denote the n x n diagonal matrix whose (i, i)-entry is $v_i$. 
Similarly, let Diag(v) denote the n x n x n diagonal tensor with (i, i, i)-entry $v_i$. In 
particular, if 1 is the vector with all entries 1, then $\mathrm{diag}(1) = I_n$ is the identity matrix. 
We denote its tensor analog by $D = D_n = \mathrm{Diag}(1)$. 

By D(C) and D(R) we denote the G(C)- and G(R)-orbits of D, respectively. 

Proposition 2.1. D(C) is the set of all n x n x n complex tensors of C-rank n 
and multilinear rank (n, n, n). 

D(R) is the set of all n x n x n real tensors of R-rank n and multilinear rank 
(n, n, n). 

Proof. First, observe that D = Diag(1) has C- and R-rank n, and multilinear 
rank (n, n, n): The multilinear rank is clear since the $n^2 \times n$ flattenings of D all have 
the standard basis vectors among their rows. The C- and R-ranks of D are at most 
n, since 

$$ D = \sum_{i=1}^{n} e_i \otimes e_i \otimes e_i. $$

If D had C- or R-rank k < n, so $D = \sum_{i=1}^{k} u_i \otimes v_i \otimes w_i$, then 

$$ I_n = D *_3 1 = \sum_{i=1}^{k} (w_i \cdot 1)\, u_i \otimes v_i $$

would have matrix rank < n, which is absurd. 

That every element of these orbits has the stated tensor rank and multilinear 
rank is a consequence of their invariance under the group actions. 

To see that every complex tensor P of rank n and multilinear rank (n, n, n) lies 
in D(C), first write 

$$ P = \sum_{i=1}^{n} u_i \otimes v_i \otimes w_i. $$

Then since the first flattening of P to an $n^2 \times n$ matrix has rank n, the $u_i$ must 
be independent, so the matrix $g_1$ with ith row $u_i$ is in GL(n,C). By the same 
reasoning, the matrices $g_2, g_3$ with ith rows $v_i, w_i$ are in GL(n,C). Then one checks 
that $P = D(g_1, g_2, g_3) \in D(C)$. The same argument applies in the real case. □ 
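
The final check in this proof, that a sum of n rank-1 terms equals the diagonal tensor acted on by the matrices whose rows are the factor vectors, can be sketched as follows. The function names and the 2 x 2 x 2 example vectors are illustrative assumptions; the action is written out entrywise as P(g_1, g_2, g_3)_{abc} = sum over l, m, p of P_{lmp} g_1(l,a) g_2(m,b) g_3(p,c).

```python
def diag_tensor(n):
    """D = Diag(1): ones at positions (i, i, i), zeros elsewhere."""
    return [[[1 if i == j == k else 0 for k in range(n)]
             for j in range(n)] for i in range(n)]


def act_all(P, g1, g2, g3):
    """Entrywise form of the right action P(g1, g2, g3)."""
    R = range(len(P))
    return [[[sum(P[l][m][p] * g1[l][a] * g2[m][b] * g3[p][c]
                  for l in R for m in R for p in R)
              for c in R] for b in R] for a in R]


def rank_one_sum(us, vs, ws):
    """sum_i u_i (x) v_i (x) w_i for lists of vectors us, vs, ws."""
    R = range(len(us[0]))
    return [[[sum(u[a] * v[b] * w[c] for u, v, w in zip(us, vs, ws))
              for c in R] for b in R] for a in R]


# Illustrative vectors (assumed); each list, read as the rows of a matrix, is invertible.
us = [[1, 2], [0, 1]]
vs = [[1, 1], [1, -1]]
ws = [[2, 0], [1, 3]]

P_from_terms = rank_one_sum(us, vs, ws)
P_from_orbit = act_all(diag_tensor(2), us, vs, ws)  # rows of g1, g2, g3 are u_i, v_i, w_i
```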

Note that not every tensor of C-rank n is in D(C). For instance, the tensor P 
with slices I, 0, 0, ..., 0 has tensor rank n, but is not in D(C) since it has multilinear 
rank (n, n, 1). However, P is in the closure of D(C), since one can give a sequence $\{h_i\}$ 
of elements in GL(n,C) with $\lim h_i = 1 e_1^T$, and then $\lim D(I, I, h_i) = P$. Similar 
reasoning shows that any tensor of complex tensor rank < n is in the closure of D(C), 
and that an analogous statement holds for D(R). 
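
The multilinear ranks quoted in this example can be confirmed by computing the ranks of the three flattenings exactly. The sketch below is one possible way to do this over the rationals with Python's standard `fractions` module; the function names and the choice n = 3 are assumptions made for illustration.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    ncols = len(M[0]) if M else 0
    for c in range(ncols):
        pivot = next((r for r in range(rank, len(M)) if M[r][c] != 0), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(rank + 1, len(M)):
            factor = M[r][c] / M[rank][c]
            M[r] = [a - factor * b for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank


def flattening(P, mode):
    """Flattening with rows indexed by the two other indices, columns by `mode`."""
    dims = (len(P), len(P[0]), len(P[0][0]))
    others = [a for a in range(3) if a != mode]
    rows = []
    for a in range(dims[others[0]]):
        for b in range(dims[others[1]]):
            row = []
            for c in range(dims[mode]):
                idx = [0, 0, 0]
                idx[others[0]], idx[others[1]], idx[mode] = a, b, c
                row.append(P[idx[0]][idx[1]][idx[2]])
            rows.append(row)
    return rows


def multilinear_rank(P):
    return tuple(matrix_rank(flattening(P, m)) for m in range(3))


n = 3
D3 = [[[1 if i == j == k else 0 for k in range(n)] for j in range(n)] for i in range(n)]
# Tensor whose 3-slices are I, 0, 0.
P_slices = [[[1 if (i == j and k == 0) else 0 for k in range(n)]
             for j in range(n)] for i in range(n)]
```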

Finally, note that the closures of D(C) and D(R) also contain tensors of rank 
> n. Indeed, the phenomenon that tensor rank may increase when one takes a limit 
is a key difference from the matrix rank. We will return to this with some explicit 
examples in §8. 

3. The variety of rank n tensors, and certain orbit representatives. Let 

$V_n \subset C^{n^3}$ denote the closure of D(C), under either the Zariski or standard topology, 
as these give the same set. This is the smallest algebraic variety containing all tensors 
of C-rank n and multilinear rank (n,n,n). (It is straightforward to see it is also 
the smallest variety containing all tensors of C-rank n.) Since $V_n$ is the closure of a 
G(C)-invariant set, $V_n$ is also G(C)-invariant. As mentioned in the preceding section, 
for all $n \ge 2$, $V_n$ contains both tensors of rank less than n and tensors of rank greater 
than n. Tensors in $V_n \setminus V_{n-1}$ have rank $\ge n$, and are said to have border rank n. 

A key fact we will use is that for all n some defining equations for the variety 
V n are known, those given by the commutation relations [5J [3] . (The essential idea 
behind these seems to have first appeared in 

Proposition 3.1. The ideal $I(V_n)$ of polynomials vanishing on $V_n$ includes those 
obtained from entries of the following matrix equations, $i = 1,2,3$; $1 \le j < k \le n$: 

$$ (P *_i e_j)\, \mathrm{adj}(P *_i v)\, (P *_i e_k) - (P *_i e_k)\, \mathrm{adj}(P *_i v)\, (P *_i e_j) = 0, \tag{3.1} $$

where $v \in C^n$ is an arbitrary vector, and 'adj' denotes the classical adjoint matrix. 

While the identity (3.1) still holds if the $e_j, e_k$ appearing in it are replaced by more 
general vectors, this only yields linear combinations of the identities above and thus 
no essentially new polynomials. Moreover, by treating v as a vector of indeterminates, 
and considering the coefficients of the monomials in v that result from equation (3.1), 
one may list a finite set of polynomials in P linearly spanning this set for all choices 
of v. 

In the case n = 2, these relations are all trivial (i.e., simplify to 0 = 0). If n = 3, 
these polynomials are known to generate $I(V_3)$ [13]. For $n \ge 4$, it is known that 
additional polynomials are needed to generate $I(V_n)$, but little is known about them. 
A reward offered by one of the authors (ESA) in 2007 for determining $I(V_4)$ has led 
to that question being called 'The Salmon Problem.' Currently, only set-theoretic 
defining polynomials have been determined [TU1 [5J E]. 

Though the orbit of D is dense in V n , additional orbits lie in V n as well. Next we 
show that some of the G(C)-orbits in V n have orbit representatives of a certain form. 
This semi-canonical form will be used for determining on which tensors the functions 
constructed in the next section vanish. 

Definition 3.2. An n x n x n tensor P is i-slice-non-singular if there is some 
C-linear combination of the i-slices that is non-singular, and slice-non-singular if it 
is i-slice-non-singular for some $i \in \{1,2,3\}$. If P is not i-slice-non-singular, we say 
it is i-slice-singular. If P is not slice-non-singular, we say it is slice-singular. 

Note that P is i-slice-non-singular if, and only if, one can act on P in the ith index 
by an element of GL(n,C) to obtain a tensor with a non-singular i-slice. Thus the 
terms in the above definition all depend only on the G(C) orbit of P. To investigate 
orbits, we consider slice-singular and slice-non-singular ones separately. 

Let $x = (x_1, x_2, \dots, x_n)$ be a vector of indeterminates. Then P is i-slice-singular 
precisely when $h_i(P;x) = \det(P *_i x)$ is the zero polynomial in x. Thus the 
i-slice-singular tensors form an algebraic variety, defined by setting equal to zero the 
coefficients of each x-monomial in the expansion of $h_i$. 

Proposition 3.3. Suppose P is i-slice-non-singular and for that i satisfies the 
polynomials of equation (3.1) in Proposition 3.1. Then P has a G(C)-orbit 
representative with all i-slices upper triangular. 

Moreover, if the matrix Z whose columns are the diagonals of the i-slices of such 
a representative is non-singular, then $P \in D(C)$. If Z is singular, then an orbit 
representative exists for P with upper triangular i-slices and at least one slice strictly 
upper triangular. 

Theorem 4.1 below implies that the matrix Z of this proposition is non-singular 
precisely when $P \in D(C)$. 

Proof. For convenience, suppose P is 3-slice-non-singular, with 3-slices $P_1, P_2, \dots, P_n$. 
Then, passing to other elements in its G(C) orbit, we may first assume P has a 
non-singular slice, and then that it has an identity slice, say $P_1 = I$. But then the 
commutation relations of Proposition 3.1 with $v = e_1$ say that for any $1 \le j, k \le n$, 

$$ P_j\, \mathrm{adj}(P_1)\, P_k - P_k\, \mathrm{adj}(P_1)\, P_j = 0, $$

so 

$$ P_j P_k = P_k P_j. $$

Since these parallel slices commute, they can be simultaneously upper-triangularized, 
by a unitary $g_1$. Thus acting by $(g_1^T, g_1^{-1}, I) \in G(C)$, we may assume the slices $P_j$ are 
all upper-triangular. 

Let Z be the matrix whose columns are the diagonals of the slices, i.e., $Z(i,j) = 
P(i,i,j)$. 
Suppose first that Z is non-singular. Then acting on P by $(I,I,g_2)$ for appropriately 
chosen $g_2$ will preserve the upper triangular form of the slices, keep $P_1 = I$, 
but make another slice, say $P_2$, have distinct entries on its diagonal: To see this, note 
that the action of $(I,I,g_2)$ on P will send Z to $Zg_2$. Choose some Z' whose first 
column is 1, second column has distinct entries, and is non-singular, and choose $g_2$ so 
$Zg_2 = Z'$. Then $P' = P(I,I,g_2)$ will have all upper triangular slices, with 1 on the 
diagonal of $P'_1$ and distinct entries on the diagonal of $P'_2$. To see that in fact $P'_1 = I$, 
for any fixed $i < j$ consider the row vector $w_{ij} = P(i,j,\cdot)$, whose entries come from 
the strictly upper triangular entries of the 3-slices. Now there is some row vector a 
with $w_{ij} = aZ$. But since the first entry of $w_{ij}$ is 0, and the first column of Z is 1, 
we see $0 = a1$. Thus $w_{ij} g_2 = aZg_2 = aZ'$ implies the first entry of $w_{ij} g_2$ is also 0, 
since the first column of Z' is 1. 

Assuming now that $P_2$ has distinct entries on its diagonal, it can be diagonalized 
by acting on P by some $(g_3^T, g_3^{-1}, I)$, without changing $P_1 = I$ or the diagonal entries 
of $P_2$. But then the commutation of $P_2$ with all other slices shows they are also 
diagonal. Moreover, Z being non-singular is equivalent to a statement that no non-zero 
linear combination of the upper-triangular slices is nilpotent. This property 
is preserved by the action of $(g_3^T, g_3^{-1}, I)$, and so the new matrix Z of diagonals is 
non-singular as well. Thus by a final action by $(I, I, Z^{-1})$, we obtain D. 

If, on the other hand, Z is singular when the $P_i$ are upper triangular, then there 
exists a $g_4 \in GL(n,C)$ with $Zg_4$ having a column of zeros. Acting on P by $(I, I, g_4)$ 
preserves the upper triangular form of the slices, but ensures one slice has zeros on 
the diagonal. □ 

4. Construction of invariant functions, and their behavior on $V_n$. In 

this section, for all $n \ge 2$ we construct explicit polynomial and rational functions 
on the n x n x n tensors, with invariance properties under G(C). When n = 2 this 
construction gives Cayley's hyperdeterminant A, though for larger n it appears to 
have not been studied before. We then investigate the values these functions take 
on when restricted to $V_n$. Using Proposition 3.3, the explicitness of our construction 
allows us to show that the non-vanishing of the functions distinguishes the orbit D(C). 

Let $x = (x_1, \dots, x_n)$ be a column vector of auxiliary indeterminates. For an n x n x n 
tensor P, consider the following functions, for $i \in \{1,2,3\}$: 

$$ h_i(P;x) = \det(P *_i x), \tag{4.1} $$

$$ f_i(P;x) = (-1)^{n-1} \det\big(H_x(h_i(P;x))\big). \tag{4.2} $$

Here det M denotes the determinant of a matrix M, and $H_x$ is the Hessian operator 
on a scalar-valued function, giving the matrix of 2nd-order partial derivatives with 
respect to the indeterminates x. 

These functions are polynomials in the entries of P and x, homogeneous in each. 
Their degrees are: 

$$ \deg_P(h_i) = n, \qquad \deg_x(h_i) = n, \tag{4.3} $$

$$ \deg_P(f_i) = n^2, \qquad \deg_x(f_i) = n(n-2). \tag{4.4} $$

From the action of G(C) on tensors, the functions above inherit certain invariance 
properties. If $P' = P(g_1, g_2, g_3)$, and $\{i,j,k\} = \{1,2,3\}$ then one sees 

$$ h_i(P';x) = \det(g_j)\det(g_k)\, h_i(P; g_i x). \tag{4.5} $$
Since by the chain rule, 

$$ H_x(h_i(P';x)) = \det(g_j)\det(g_k)\, g_i^T\, \big(H_x h_i\big)(P; g_i x)\, g_i, $$

taking determinants yields 

$$ f_i(P';x) = \det(g_j)^n \det(g_k)^n \det(g_i)^2\, f_i(P; g_i x). \tag{4.6} $$



Remark. When n = 2, note that $f_i(P;x) = f_i(P)$ is independent of x, and can be 
seen to be independent of i as well, by calculating its explicit formula. Moreover, 
$f_i(P) = A(P)$ since in this case our construction is exactly Schläfli's construction of 
the 2 x 2 x 2 hyperdeterminant from the 2 x 2 determinant: Since $\det(P *_i x)$ is a 
quadratic form when n = 2, the determinant of the Hessian of this form is the same as 
the discriminant of the form. Schläfli's construction is typically presented using the 
discriminant [14], and that formulation then generalizes to yield (a multiple of) the 
hyperdeterminant for tensors of larger dimension (2 x 2 x 2 x 2, etc.). Our functions 
$f_i$ are a different generalization of the construction, for which the format of the tensor 
is n x n x n, and does not yield hyperdeterminants. 

Remark. For $n \ge 3$, $f_i$ is not independent of the auxiliary indeterminates x. In 
classical language, such a function might be called a covariant for the ith factor in G(C), 
or a concomitant (see, for instance, [16], [22], [24]). Only in the case n = 2 is $f_i$ an 
invariant in the strict sense of the term (i.e., associated with a one-dimensional 
representation). In §5 we will see $f_i$ is associated to a higher dimensional representation 
when n > 2. 

We next use the function $f_i$ to obtain a semialgebraic description of the orbit 
D(C). Since the vector x has indeterminate entries, by a statement that $f_i(P;x) = 0$ 
we mean that when $f_i$ is evaluated at P the resulting polynomial in x is identically 
zero. Thus $f_i(P;x) \ne 0$ means at least one coefficient of an x-monomial is non-zero 
at P. 

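Theorem 4.1. $P \in D(C)$ if, and only if, for some $i \in \{1,2,3\}$, P satisfies 
the equations (3.1) and $f_i(P;x) \ne 0$. Moreover, if these conditions hold for one 
$i \in \{1,2,3\}$, then they hold for all. 

Proof. Since $D *_i x = \mathrm{diag}(x_1, x_2, \dots, x_n)$, one computes 

$$ h_i(D;x) = x_1 x_2 \cdots x_n, \tag{4.7} $$

and 

$$
f_i(D;x) = (-1)^{n-1} \det
\begin{pmatrix}
0 & x_3 x_4 x_5 \cdots x_n & x_2 x_4 x_5 \cdots x_n & \cdots & x_2 x_3 x_4 \cdots x_{n-1} \\
x_3 x_4 x_5 \cdots x_n & 0 & x_1 x_4 x_5 \cdots x_n & \cdots & x_1 x_3 x_4 \cdots x_{n-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
x_2 x_3 x_4 \cdots x_{n-1} & x_1 x_3 x_4 \cdots x_{n-1} & x_1 x_2 x_4 \cdots x_{n-1} & \cdots & 0
\end{pmatrix},
$$

so 

$$
(x_1 x_2 \cdots x_n)^2 f_i(D;x) = (-1)^{n-1} \det\left( (x_1 x_2 \cdots x_n)
\begin{pmatrix}
0 & 1 & 1 & \cdots & 1 \\
1 & 0 & 1 & \cdots & 1 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & 1 & 1 & \cdots & 0
\end{pmatrix} \right),
$$

hence 

$$ f_i(D;x) = (n-1)(x_1 x_2 \cdots x_n)^{n-2}. \tag{4.8} $$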
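
Formula (4.8) can be spot-checked numerically. The sketch below builds the Hessian of h(x) = x_1 x_2 ... x_n directly from its closed form (zero diagonal, and off-diagonal entry (a, b) equal to the product of the x_c with c different from a and b), then compares (-1)^(n-1) times its determinant with (n-1)(x_1 ... x_n)^(n-2) at a few rational points. The function names and sample points are illustrative assumptions; this is a check at chosen values, not a symbolic proof.

```python
from fractions import Fraction
from math import prod


def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n = len(M)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            result = -result  # a row swap flips the sign
        result *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            M[r] = [a - factor * b for a, b in zip(M[r], M[c])]
    return result


def hessian_of_product(x):
    """Hessian of h(x) = x_1 ... x_n, written out from its closed form."""
    n = len(x)
    return [[0 if a == b else prod(x[c] for c in range(n) if c not in (a, b))
             for b in range(n)] for a in range(n)]


def f_at_D(x):
    """(-1)^(n-1) det H_x(h_i(D; x)), i.e. f_i(D; x) evaluated at the point x."""
    n = len(x)
    return (-1) ** (n - 1) * det(hessian_of_product(x))


def rhs_4_8(x):
    n = len(x)
    return (n - 1) * prod(x) ** (n - 2)
```

For instance, at x = (2, 3, 5) both sides evaluate to 60.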

The transformation formula (4.6) then implies that if $P \in D(C)$ then $f_i(P;x) \ne 0$ for 
all i. That equations (3.1) hold when $P \in D(C)$ is stated in Proposition 3.1. 

Conversely, suppose for some i that $f_i(P;x) \ne 0$ and the equations (3.1) hold. 
Then P must be i-slice-non-singular, since i-slice-singularity means $h_i(P;x)$ is the 
zero polynomial, which implies $f_i(P;x) = 0$. Thus Proposition 3.3 applies, and we 
see P is G(C)-equivalent to a tensor P' with upper triangular slices in index i. If such 
a P' had a slice with diagonal 0, then $\det(P' *_i x)$ would be independent of one of the 
$x_j$, so its Hessian would have a zero row (and column), implying $f_i(P';x)$ is the zero 
polynomial. By the transformation property (4.6), it would follow that $f_i(P;x) = 0$ 
as well. Thus the slices of P' cannot have diagonals of 0, and Proposition 3.3 thus 
shows $P \in D(C)$. □ 

Note that the above theorem is only concerned with values of the $f_i$ on the variety 
defined by equations (3.1); it makes no statement about $f_i$ off this variety. 

Since the variety defined by equations (3.1) is a supervariety of $V_n$, we immediately 
obtain the following. 

Corollary 4.2. $P \in D(C)$ if, and only if, $P \in V_n$ and $f_i(P;x) \ne 0$ for some 
(and hence all) $i \in \{1,2,3\}$. 

Equations (4.7) and (4.8) suggest consideration of the rational function 

$$ r_i(P;x) = \frac{f_i(P;x)}{(n-1)\, h_i(P;x)^{n-2}}, $$

which is defined on the i-slice-non-singular tensors, and satisfies 

$$ r_i(D;x) = 1. \tag{4.9} $$

Moreover, if $P' = P(g_1, g_2, g_3)$, then the transformation formulas (4.5) and (4.6) yield 

$$ r_i(P';x) = \det(g_1)^2 \det(g_2)^2 \det(g_3)^2\, r_i(P; g_i x). \tag{4.10} $$

Since $r_i(D;x)$ is independent of x, equation (4.10) implies that $r_i(P;x)$ is independent 
of x when $P \in D(C)$, and thus, by continuity, when $P \in V_n$. 

Corollary 4.3. $P \in D(C)$ if, and only if, P satisfies the equations (3.1) and 
$r_i(P;x)$ is defined and non-zero for some (and hence all) $i \in \{1,2,3\}$. 

In this statement the condition that P satisfies the equations (3.1) can of course 
be replaced by $P \in V_n$, as in Corollary 4.2. 



While it is tempting to hope that $r_i(P;x)$ is independent of x for all P, one can 
verify that this is not the case even when n = 3. Indeed, for the tensor 




P = 
we have that, writing $x = (x, y, z)$, 

$$ h_3(P;x) = \det\begin{pmatrix} x & z & 0 \\ z & y & 0 \\ 0 & 0 & z \end{pmatrix} = xyz - z^3, $$

and 

$$ f_3(P;x) = \det\begin{pmatrix} 0 & z & y \\ z & 0 & x \\ y & x & -6z \end{pmatrix} = 2xyz + 6z^3. $$

Thus $r_3(P;x)$ is neither independent of x, nor a polynomial function of x. This 
example can easily be modified for n > 3. 
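
The two displayed polynomials are easy to cross-check with integer arithmetic. The sketch below takes h_3 = xyz - z^3 as given, writes out its Hessian by hand, and confirms at sample points that the determinant equals 2xyz + 6z^3 and that the ratio r_3 = f_3 / ((n-1) h_3^(n-2)) with n = 3 takes different values at different points. The function names and sample points are illustrative assumptions.

```python
from fractions import Fraction


def h3(x, y, z):
    return x * y * z - z**3


def hessian_h3(x, y, z):
    """Second partial derivatives of xyz - z^3 with respect to (x, y, z)."""
    return [[0, z, y],
            [z, 0, x],
            [y, x, -6 * z]]


def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def f3(x, y, z):
    # The sign factor (-1)^(n-1) equals +1 for n = 3.
    return det3(hessian_h3(x, y, z))


def r3(x, y, z):
    # r_3 = f_3 / ((n - 1) * h_3^(n - 2)) with n = 3; requires h3(x, y, z) != 0.
    return Fraction(f3(x, y, z), 2 * h3(x, y, z))
```

At (x, y, z) = (2, 1, 1) the ratio is 5, while at (1, 2, 3) it is -29/7, so r_3 does depend on x.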

Suppose now one had a polynomial, F(P), satisfying 

$$ F(P') = \det(g_1)^{2k} \det(g_2)^{2k} \det(g_3)^{2k} F(P), \tag{4.11} $$

when $P' = P(g_1, g_2, g_3)$. That is, suppose F(P) is an invariant of weight (2k, 2k, 2k) 
for G(C). Then provided $F(D) \ne 0$, we may normalize so that F(D) = 1, and then 
observe that by their transformation formulas 

$$ r_i(P)^k = F(P), \quad \text{for } P \in D(C), \tag{4.12} $$

and thus, by continuity, 

$$ f_i(P;x)^k = (n-1)^k\, h_i(P;x)^{k(n-2)}\, F(P) $$

for all $P \in V_n$. This yields the following. 

Proposition 4.4. Let F(P) be an invariant of weight (2k, 2k, 2k) for G(C), such 
that $F(D) \ne 0$. Then Theorem 4.1 and Corollary 4.2 remain true if $f_i$ is replaced by 
the function 

$$ G_i(P;x) = h_i(P;x)^{k(n-2)}\, F(P). $$

While it is relatively straightforward to investigate the existence of polynomial 
invariants with weights of the type required for F(P) for small n, it is less easy to 
give them explicitly. We discuss this further in the next section. 

5. Representations and $n \times n \times n$ tensors. The previous section took an explicit,
constructive approach to defining the invariants $f_i$. Here we turn to a more
general understanding of the representation theory of $G(\mathbb{C})$. In particular, the
transformation formula (4.6) and Proposition 4.4 indicate that studying all polynomials
with good transformation properties under the group action might be useful.
As a general background to the material in this section, we suggest HH d3] 

5.1. Representations and decompositions. As we will be concerned only
with complex representations of complex groups, we suppress mention of $\mathbb{C}$ in our
notation in this section. We also let $V = \mathbb{C}^n$ (which should not be confused with our
use of $V_n$ for an algebraic variety in other sections).

Recall that a representation of a group $G$ is a homomorphism $\rho : G \to GL(W)$.
If $W$ has no proper non-zero subspaces that are invariant under the action of $G$, then $\rho$ is said
to be irreducible. In particular, the irreducible representations of the general linear
group on $V = \mathbb{C}^n$ are well understood to be

$$\rho_\lambda : GL(V) \cong GL(n,\mathbb{C}) \to GL(V^\lambda) \cong GL(n_\lambda,\mathbb{C}),$$

where the $\rho_\lambda$ are labelled by integer partitions $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_\ell)$, with
$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_\ell > 0$. We say $\lambda$ is a partition of $m$, and write $\lambda \vdash m$ and $|\lambda| = m$, when
$\sum_{i=1}^{\ell} \lambda_i = m$. We refer to $\ell$ as the depth of $\lambda$. The representing space $V^\lambda \cong \mathbb{C}^{n_\lambda}$ (also
referred to as a $G$-module), of dimension $n_\lambda$, can be expressed using the Schur functors
$V^\lambda = \mathbb{S}_\lambda(V)$, but it is usually simpler to avoid explicitly doing so, and instead work
with the characters of the representations. The characters $\{\lambda\} = \operatorname{tr} \circ \rho_\lambda$ are given by
the Schur functions $s_\lambda$ [22], with

$$\{\lambda\}(g) = s_\lambda(\xi),$$

where $\xi = (\xi_1, \xi_2, \ldots, \xi_n)$ are class parameters (eigenvalues) for $g$. Crucially, the Schur
functions can be defined, and their properties explored, in a combinatorial manner
quite independently of their role as characters for the general linear group [23].

The dimension $n_\lambda$ of the representation $\rho_\lambda$ can be calculated by the
hook length formula, which counts the number of semi-standard tableaux of shape $\lambda$.
For instance, the defining representation of $GL(V)$ is associated to the partition $(1)$,
with $n_{(1)} = n$, so $\{1\}$ denotes its trace and $s_{(1)}(\xi) = \xi_1 + \xi_2 + \cdots + \xi_n$.

From two representations $\rho_\lambda$ and $\rho_{\lambda'}$ of $GL(V)$, one can construct the tensor
product representation $(\rho_\lambda \otimes \rho_{\lambda'})(g) := \rho_\lambda(g) \otimes \rho_{\lambda'}(g)$, with character denoted
$\{\lambda\} \otimes \{\lambda'\}$. While this representation may be reducible, its decomposition into irreducible
representations of $GL(V)$ can be found using the pointwise product of Schur functions:

$$(\{\lambda\} \otimes \{\lambda'\})(g) = s_\lambda(\xi)\, s_{\lambda'}(\xi) = \sum_{\alpha} c^{\alpha}_{\lambda\lambda'}\, s_\alpha(\xi).$$

Here the multiplicities $c^{\alpha}_{\lambda\lambda'}$ are computable using the Littlewood-Richardson rule [33],
with, for instance, software such as Schur [32].

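The irreducible representations of $GL(n_1) \times GL(n_2) \times GL(n_3)$ are tensor products
of the irreducible representations of the $GL(n_i)$:

$$\rho_{\lambda_1} \times \rho_{\lambda_2} \times \rho_{\lambda_3} : GL(n_1) \times GL(n_2) \times GL(n_3) \to GL(n_{\lambda_1}) \times GL(n_{\lambda_2}) \times GL(n_{\lambda_3}),$$

where $(\rho_{\lambda_1} \times \rho_{\lambda_2} \times \rho_{\lambda_3})(g_1, g_2, g_3) := \rho_{\lambda_1}(g_1) \otimes \rho_{\lambda_2}(g_2) \otimes \rho_{\lambda_3}(g_3)$. We denote the
character of this representation by $\{\lambda_1\} \times \{\lambda_2\} \times \{\lambda_3\}$, to distinguish it from the
product of characters of the sort described in the last paragraph.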

The transformation of tensors $P \in U = V \otimes V \otimes V$ under elements $g = (g_1, g_2, g_3)$
of $G = G(\mathbb{C}) = GL(n,\mathbb{C}) \times GL(n,\mathbb{C}) \times GL(n,\mathbb{C})$ gives a representation of $G$ with
character $\{1\} \times \{1\} \times \{1\}$:

$$\operatorname{tr}(g_1 \otimes g_2 \otimes g_3) = \operatorname{tr}(g_1)\operatorname{tr}(g_2)\operatorname{tr}(g_3) = s_{(1)}(\xi)\, s_{(1)}(\xi')\, s_{(1)}(\xi'').$$

Denote the space of homogeneous polynomials of degree $d$ in the components of tensors
$P \in U$ by $\mathbb{C}[U]_d$. This space inherits an action of $G$ by

$$f \mapsto g \circ f,$$

where $g \circ f(P) := f(P(g_1, g_2, g_3))$, and hence forms a $G$-module. By a standard
argument, it is possible to identify $\mathbb{C}[U]_d \cong \mathbb{S}_{(d)}(U)$. While $\mathbb{C}[U]_d$ is usually
not an irreducible $G$-module, through characters we can identify its decomposition
into irreducible modules. This is done by applying the corresponding Schur function
plethysm, denoted by $\underline{\otimes}$,

$$(\{1\} \times \{1\} \times \{1\}) \,\underline{\otimes}\, \{d\} = \sum_{\sigma_1,\sigma_2,\sigma_3 \vdash d} \gamma^{(d)}_{\sigma_1\sigma_2\sigma_3}\, \{\sigma_1\} \times \{\sigma_2\} \times \{\sigma_3\}, \tag{5.1}$$

where the multiplicities $\gamma^{(d)}_{\sigma_1\sigma_2\sigma_3}$ can be calculated using standard group theory
techniques implemented in software such as Schur [32]. (See below for an outline, and
[29] for a more complete explanation.) In terms of $G$-modules, we can use (5.1) to
identify

$$\mathbb{C}[U]_d = \bigoplus_{\sigma_1,\sigma_2,\sigma_3 \vdash d} \gamma^{(d)}_{\sigma_1\sigma_2\sigma_3}\, V^{\sigma_1} \otimes V^{\sigma_2} \otimes V^{\sigma_3}. \tag{5.2}$$

The primary focus of this section is to relate the functions $f_i(P;\mathbf{x})$ to this
decomposition.

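The coefficients in a plethysm formula such as equation (5.1) are the structure
constants for Schur function "inner" products, denoted by $*$:

$$\{\alpha\} * \{\beta\} = \begin{cases} \sum_{\mu} \gamma^{\mu}_{\alpha\beta}\, \{\mu\}, & \text{if } |\alpha| = |\beta| = n, \\ 0, & \text{otherwise,} \end{cases}$$

where $\gamma^{\mu}_{\alpha\beta}$ is the multiplicity of the irreducible representation $\{\mu\}$ occurring in the
decomposition of the tensor product of the irreducible representations $\{\alpha\}$ and $\{\beta\}$
in the symmetric group $\mathfrak{S}_n$. By linearity and associativity, we can similarly define

$$\{\alpha\} * \{\beta\} * \cdots * \{\zeta\} = \sum_{\mu} \gamma^{\mu}_{\alpha\beta\ldots\zeta}\, \{\mu\},$$

and the general plethysm of a product of defining representations is

$$(\{1\} \times \{1\} \times \cdots \times \{1\}) \,\underline{\otimes}\, \{\mu\} = \sum_{\alpha,\beta,\ldots,\zeta} \gamma^{\mu}_{\alpha\beta\ldots\zeta}\, \{\alpha\} \times \{\beta\} \times \cdots \times \{\zeta\}. \tag{5.3}$$

Equation (5.1) is then the special case of a three-fold product.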

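Remark. If in equation (5.3) the characters $\{1\}$ are replaced by $\{\rho\}, \{\sigma\}, \{\tau\}, \ldots$, then
the expansion on the right-hand side of equation (5.3) would be over the respective
plethysms, $(\{\rho\} \underline{\otimes} \{\alpha\}), (\{\sigma\} \underline{\otimes} \{\beta\}), (\{\tau\} \underline{\otimes} \{\gamma\}), \ldots$. The simpler case of equation (5.3)
arises since $\{1\} \underline{\otimes} \{\alpha\} = \{\alpha\}$, $\{1\} \underline{\otimes} \{\beta\} = \{\beta\}$, etc.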

Remark. A familiar application of this theory is given by considering the $k \times k$ minors
of an $n \times n$ matrix $A$. Under the action of $GL(n) \times GL(n) : A \mapsto g_1 A g_2^T$, for each
integer $1 \le k \le n$, the span of the $k \times k$ minors of $A$ is an invariant subspace of the
homogeneous polynomials of degree $k$ in the entries of $A$. In terms of Schur function
characters, the minors must therefore appear in the decomposition of the plethysm

$$(\{1\} \times \{1\}) \,\underline{\otimes}\, \{k\} = \sum_{\alpha,\beta \vdash k} \gamma^{(k)}_{\alpha\beta}\, \{\alpha\} \times \{\beta\}. \tag{5.4}$$

Adopting the standard shorthand of using exponents to signify repeated integers in a
partition, $(k)$ and $(1^k) = (1, 1, \ldots, 1)$ label the one-dimensional "trivial" and "sign"
representations of the symmetric group $\mathfrak{S}_k$, respectively. The calculation $\gamma^{(k)}_{(1^k)(1^k)} = 1$
follows immediately from the tensor product $\operatorname{sgn}(\sigma) \otimes \operatorname{sgn}(\sigma) = \operatorname{sgn}(\sigma)\operatorname{sgn}(\sigma) = 1$
for all $\sigma \in \mathfrak{S}_k$.

Thus the character $\{1^k\} \times \{1^k\}$ appears exactly once in the above expansion. Via
the hook length formula one finds $\dim(\rho_{(1^k)} \times \rho_{(1^k)}) = \dim(\rho_{(1^k)})^2 = \binom{n}{k}^2$, and the
associated irreducible subspace is confirmed to be that spanned by the $k \times k$ minors
of $A$.

5.2. One-dimensional representations. As a first use of this viewpoint, we
investigate polynomial invariants of $G(\mathbb{C})$, that is, one-dimensional representations
within the vector space of polynomial functions. Proposition 4.4 indicates that any
polynomial $F$ satisfying equation (4.11) for any positive integer $k$ can be used in
characterizing the points on $\mathcal{D}(\mathbb{C})$ (provided $F(D) \neq 0$), so we seek to find such $F$.

The one-dimensional representation $\det(g)^k$ of $GL(n)$ has character $\{k^n\}$. Via
calculations with Schur [32], by decomposing $\mathbb{C}[U]_d$ for $n = 2, 3, 4$, we find there are
polynomials in the entries of $P$ transforming as $\{2^2\} \times \{2^2\} \times \{2^2\}$, $\{2^3\} \times \{2^3\} \times \{2^3\}$,
$\{2^4\} \times \{2^4\} \times \{2^4\}$ of degree $d = 4, 6, 8$ for $n = 2, 3, 4$ respectively. Since the
multiplicities of the representation in the decomposition are found to be

$$1 = \gamma^{(4)}_{(2^2)(2^2)(2^2)} = \gamma^{(6)}_{(2^3)(2^3)(2^3)} = \gamma^{(8)}_{(2^4)(2^4)(2^4)},$$

the polynomial is uniquely determined up to scaling. Thus for $n = 2, 3, 4$ we denote
this function by $\tau_n$, and call it the $n$-tangle, since Cayley's hyperdeterminant $\tau_2 = \Delta$
is called the tangle in the physics literature, and all transform with weight $(2,2,2)$.
(We have not yet fixed a choice of scaling for $\tau_3$, $\tau_4$, but will below.) This progression
stops with $n = 4$, but for $n = 5, 6$ there are invariants transforming with weight
$(3,3,3)$, which again occur with multiplicity 1, and so are unique up to scaling. These
results are summarized in Table 5.1. Beyond $n = 7$, computations with Schur become
prohibitive, although we verified that there are no weight $(2,2,2)$ representations for
$8 \le n \le 16$.



Tensor format    Weight (2,2,2)    Weight (3,3,3)    Weight (4,4,4)
2 x 2 x 2              1                                   1
3 x 3 x 3              1                 1                 2
4 x 4 x 4              1                 1                 5
5 x 5 x 5                                1                 6
6 x 6 x 6                                1
7 x 7 x 7

Table 5.1. Multiplicities $\gamma^{(nk)}_{(k^n)(k^n)(k^n)}$ of one-dimensional representations of weight $(k,k,k)$
in the space of homogeneous polynomials of degree $nk$ in the entries of tensors of format
$n \times n \times n$, as computed by Schur.



While the existence of invariants with the desired transformation property for
$n \le 6$ is now established, to use them in Proposition 4.4 requires that we also know
they do not vanish at $D$. For $n = 2$ this is easily seen from the explicit formula
(1.1) for the tangle; to establish it in the cases $n = 3, 4$ we turn to an
explicit construction of the invariant.

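For $n = 3$, let $\epsilon_{ijk}$ be the antisymmetric Levi-Civita tensor with $\epsilon_{123} = \epsilon_{231} =
\epsilon_{312} = 1$, $\epsilon_{213} = \epsilon_{321} = \epsilon_{132} = -1$, and $\epsilon_{ijk} = 0$ otherwise. Consider the degree 6
polynomial $\tau_3 : U \to \mathbb{C}$ defined by

$$\tau_3(P) = \sum_{1}^{3} P_{i_1 i_2 i_3} P_{j_1 j_2 j_3} P_{k_1 k_2 k_3} P_{l_1 l_2 l_3} P_{m_1 m_2 m_3} P_{n_1 n_2 n_3} \times
\epsilon_{i_1 j_1 k_1} \epsilon_{j_2 k_2 l_2} \epsilon_{k_3 l_3 m_3} \epsilon_{l_1 m_1 n_1} \epsilon_{m_2 n_2 i_2} \epsilon_{n_3 i_3 j_3},$$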






where all 18 indices run from 1 to 3 in the sum. Since $\epsilon_{ijk}$ defines a one-dimensional
representation of $GL(3)$ by

$$\sum_{1 \le i',j',k' \le 3} g(i,i')\, g(j,j')\, g(k,k')\, \epsilon_{i'j'k'} = \det(g)\, \epsilon_{ijk}, \quad \text{for } g \in GL(3,\mathbb{C}),$$

it is straightforward to check that

$$\tau_3(P(g_1, g_2, g_3)) = \det(g_1)^2 \det(g_2)^2 \det(g_3)^2\, \tau_3(P)$$

for all $(g_1, g_2, g_3) \in G(\mathbb{C})$.

As is noted in [30], expanding $\tau_3$ as a polynomial yields 1152 terms, and thus
it is not the zero polynomial. But we wish to establish the stronger statement that
$\tau_3(D) \neq 0$. Expressing $D$ in components $D_{ijk} = \delta_{ij}\delta_{jk}$, one first finds

$$\tau_3(D) = \sum_{1 \le i,j,k,l,m,n \le 3} \epsilon_{ijk}\, \epsilon_{jkl}\, \epsilon_{klm}\, \epsilon_{lmn}\, \epsilon_{mni}\, \epsilon_{nij}.$$

Now the first factor in the summand, $\epsilon_{ijk}$, is zero unless $i, j, k \in \{1, 2, 3\}$ are distinct.
Then the product of the first two factors is zero unless additionally $l = i$. Considering
the remaining four $\epsilon$ factors in this way, non-zero contributions also require $m = j$,
and $n = k$. Thus

$$\tau_3(D) = \sum_{i,j,k \text{ distinct}} \epsilon_{ijk}\, \epsilon_{jki}\, \epsilon_{kij}\, \epsilon_{ijk}\, \epsilon_{jki}\, \epsilon_{kij} = 6.$$

This confirms both that $\tau_3$ is a non-zero polynomial and that it evaluates to a positive
value on the diagonal tensor.

Similar considerations give the 4-tangle. With $\epsilon_{ijkl}$ denoting the sign of the
permutation $(ijkl)$ when $i, j, k, l$ are distinct, and 0 otherwise, let

$$\tau_4(P) = \sum_{1}^{4} P_{i_1 i_2 i_3} P_{j_1 j_2 j_3} P_{k_1 k_2 k_3} P_{l_1 l_2 l_3} P_{m_1 m_2 m_3} P_{n_1 n_2 n_3} P_{r_1 r_2 r_3} P_{s_1 s_2 s_3} \times
\epsilon_{i_1 j_1 k_1 l_1} \epsilon_{m_1 n_1 r_1 s_1} \epsilon_{i_2 l_2 m_2 s_2} \epsilon_{j_2 k_2 n_2 r_2} \epsilon_{i_3 j_3 m_3 n_3} \epsilon_{k_3 l_3 r_3 s_3},$$

where all 24 indices run from 1 to 4 in the sum. A polynomial expansion of this
has 431,424 terms. By an argument analogous to that for $\tau_3$, one sees that the only
nonzero terms in

$$\tau_4(D) = \sum_{1}^{4} \epsilon_{ijkl}\, \epsilon_{mnrs}\, \epsilon_{ilms}\, \epsilon_{jknr}\, \epsilon_{ijmn}\, \epsilon_{klrs}$$

occur when $m = k$, $n = l$, $r = i$, $s = j$, and thus that $\tau_4(D) = 24$, confirming that
$\tau_4(D)$ is also non-zero.
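
The two evaluations at the diagonal tensor are small enough to confirm by brute force. The sketch below is an illustration only, with hypothetical helper names (`perm_sign`, `tau3_at_D`, `tau4_at_D`). It sums the products of Levi-Civita symbols over all index values, using 0-based indices, and it assumes the index pattern of the two sums for $\tau_3(D)$ and $\tau_4(D)$ as read from the displayed formulas.

```python
from itertools import product

def perm_sign(idx):
    """Levi-Civita symbol: sign of the permutation idx, or 0 if an entry repeats."""
    if len(set(idx)) != len(idx):
        return 0
    sign = 1
    for a in range(len(idx)):
        for b in range(a + 1, len(idx)):
            if idx[a] > idx[b]:
                sign = -sign  # each inversion flips the sign
    return sign

def tau3_at_D():
    # sum over i,j,k,l,m,n of eps_ijk eps_jkl eps_klm eps_lmn eps_mni eps_nij
    total = 0
    for i, j, k, l, m, n in product(range(3), repeat=6):
        total += (perm_sign((i, j, k)) * perm_sign((j, k, l))
                  * perm_sign((k, l, m)) * perm_sign((l, m, n))
                  * perm_sign((m, n, i)) * perm_sign((n, i, j)))
    return total

def tau4_at_D():
    # sum of eps_ijkl eps_mnrs eps_ilms eps_jknr eps_ijmn eps_klrs
    total = 0
    for i, j, k, l, m, n, r, s in product(range(4), repeat=8):
        first = perm_sign((i, j, k, l))
        if first == 0:
            continue  # skip early: most index choices contribute nothing
        total += (first * perm_sign((m, n, r, s))
                  * perm_sign((i, l, m, s)) * perm_sign((j, k, n, r))
                  * perm_sign((i, j, m, n)) * perm_sign((k, l, r, s)))
    return total

print(tau3_at_D(), tau4_at_D())
```

Evaluating $\tau_3$ or $\tau_4$ at a general tensor by the same direct summation would need $3^{18}$ or $4^{24}$ terms, so this approach is only practical at sparse tensors such as $D$.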

We do not have an explicit construction of invariants for n = 5, 6 that have weight 
(3, 3, 3), and we therefore do not know whether they vanish at D. 

5.3. Higher-dimensional representations. The functions $f_i$ constructed in
§4 are not invariants when $n > 2$, due to their dependence on the auxiliary variables
$\mathbf{x}$, and thus do not define one-dimensional representations of $G(\mathbb{C})$. We therefore turn
to studying higher dimensional representations in polynomials.

A consequence of considering irreducible modules of polynomials is that statements
concerning polynomials vanishing on $G(\mathbb{C})$-invariant sets which apply to a
specific element of the module must also apply to the module as a whole. This is
formalized in the two following lemmas.

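Lemma 5.1. Let $f \in \mathbb{C}[U]_d$. Then $\langle \{g \circ f : g \in G\} \rangle_{\mathbb{C}}$, the linear span of the
$G$-orbit of $f$, is a $G$-module. In particular, if $W$ is an irreducible submodule of $\mathbb{C}[U]_d$,
and $0 \neq f \in W$, then $\langle \{g \circ f : g \in G\} \rangle_{\mathbb{C}} = W$.

Proof. If $p \in \langle \{g \circ f : g \in G\} \rangle_{\mathbb{C}}$, then, for some finite subset $S \subset G$ and $c_h \in \mathbb{C}$,

$$p = \sum_{h \in S} c_h\, (h \circ f).$$

Thus if $g \in G$,

$$g \circ p = \sum_{h \in S} c_h\, (gh \circ f) \in \langle \{g \circ f : g \in G\} \rangle_{\mathbb{C}},$$

so the linear span of the orbit is a $G$-module.

Now for any $G$-module $W$ and $f \in W$, $\langle \{g \circ f : g \in G\} \rangle_{\mathbb{C}} \subseteq W$. Irreducibility of
$W$ and $f \neq 0$ thus implies $\langle \{g \circ f : g \in G\} \rangle_{\mathbb{C}} = W$. □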

Although as stated here this lemma applies to $f$ in a $G$-module of polynomials,
the result is a standard one for any $G$-module. Though also stated for polynomials,
the next result holds more generally for $G$-modules of functions where the action of
$G$ arises from an action on their domain.

Lemma 5.2. Let $\mathcal{S}$ be a $G$-invariant subset of $U$, and $W \subset \mathbb{C}[U]_d$ an irreducible
$G$-module. Then $f|_{\mathcal{S}} = 0$ for some non-zero $f \in W$ if, and only if, $p|_{\mathcal{S}} = 0$ for all
$p \in W$.

Proof. Let $f \in W$ with $f|_{\mathcal{S}} = 0$. Then for any $p \in W$, $P \in U$, by the preceding
lemma

$$p(P) = \sum_{h \in S} c_h\, (h \circ f)(P) = \sum_{h \in S} c_h\, f(Ph).$$

But $P \in \mathcal{S}$ implies $Ph \in \mathcal{S}$, so this shows $p|_{\mathcal{S}} = 0$. □

There are several $G$-invariant sets of interest in this paper. They are $\mathcal{D}(\mathbb{C})$, the
orbit of the tensor $D$; $V_n = \overline{\mathcal{D}(\mathbb{C})}$, the orbit closure; and $V_n \setminus \mathcal{D}(\mathbb{C})$, the complement
of the orbit in its closure. However, by continuity, polynomials that vanish on $\mathcal{D}(\mathbb{C})$
vanish on its closure $V_n$ as well, so investigating polynomials vanishing on either of
these sets leads to polynomials in the defining ideal of the variety $V_n$. Indeed, some
such polynomials are given in Proposition 3.1, though not from the point of view of
representations.

The functions $f_i(P;\mathbf{x})$ constructed in the previous section, however, are zero on $V_n \setminus \mathcal{D}(\mathbb{C})$,
and non-zero on $\mathcal{D}(\mathbb{C})$, by Corollary 4.2. Thus Lemma 5.2 suggests relating the
classical viewpoint on $f_i$ used in their construction to the language of representations.
Without loss of generality we focus on $f_3(P;\mathbf{x})$ and think of $f_3(P;\mathbf{x})$ as providing a
set of functions $f_3(\,\cdot\,;\mathbf{x}) : P \mapsto f_3(P;\mathbf{x})$ parametrized by the auxiliary variables $\mathbf{x}$.

Notationally, we use non-negative integer vectors $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$ to express
a monomial $\mathbf{x}^\alpha := x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$ of total degree $d = \sum_{1 \le i \le n} \alpha_i$. Then

$$f_3(P;\mathbf{x}) = \sum_{\alpha} p_\alpha(P)\, \mathbf{x}^\alpha,$$

with coefficients $p_\alpha \in \mathbb{C}[U]_{n^2}$, where the monomials $\mathbf{x}^\alpha$ are interpreted as basis
elements for $\mathbb{C}[V]_d \cong V^{(d)}$, with $d = n(n-2)$.

One can see directly that no $p_\alpha$ is identically zero: To start, note that equation
(4.8) gives

$$\sum_{\alpha} p_\alpha(D(I,I,g_3))\, \mathbf{x}^\alpha = f_3(D(I,I,g_3);\mathbf{x})
 = \det(g_3)^2 f_3(D; g_3\mathbf{x})
 = \det(g_3)^2 (n-1) \bigl( (g_3\mathbf{x})_1 (g_3\mathbf{x})_2 \cdots (g_3\mathbf{x})_n \bigr)^{n-2}. \tag{5.5}$$

Choosing $g_3 \in GL(n,\mathbb{C})$ with strictly positive entries, every possible monomial $\mathbf{x}^\alpha$
appears in the expansion of $\bigl( (g_3\mathbf{x})_1 (g_3\mathbf{x})_2 \cdots (g_3\mathbf{x})_n \bigr)^{n-2}$. Hence $p_\alpha(D(I,I,g_3)) \neq 0$
for all $\alpha$.

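To understand the transformation of $p_\alpha$ under $g = (g_1, g_2, g_3) \in G(\mathbb{C})$, observe
that $g_3$ maps the monomial $\mathbf{x}^\alpha$ of total degree $d$ by

$$\mathbf{x}^\alpha \mapsto \Bigl( \sum_{i_1=1}^{n} g_3(1,i_1)\, x_{i_1} \Bigr)^{\alpha_1} \Bigl( \sum_{i_2=1}^{n} g_3(2,i_2)\, x_{i_2} \Bigr)^{\alpha_2} \cdots \Bigl( \sum_{i_n=1}^{n} g_3(n,i_n)\, x_{i_n} \Bigr)^{\alpha_n} = \sum_{\beta} \tilde{g}_3(\alpha,\beta)\, \mathbf{x}^\beta.$$

The matrix elements $\tilde{g}_3(\alpha,\beta)$ provide precisely the irreducible representation $\rho_\lambda :
GL(n,\mathbb{C}) \to GL(n_\lambda,\mathbb{C})$ with $\lambda = (d)$ and $\tilde{g}_3 = \rho_{(d)}(g_3)$. The polynomials $p_\alpha(P)$
therefore transform under $G(\mathbb{C})$ as

$$p_\alpha \mapsto \det(g_1)^n \det(g_2)^n \det(g_3)^2 \sum_{\beta} p_\beta\, \tilde{g}_3(\beta,\alpha). \tag{5.6}$$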

This formula also implies that the $p_\alpha$ are independent: If $c = (c_\alpha)$ specifies a
dependency relation $\sum_\alpha c_\alpha p_\alpha = 0$, then from (5.6) it follows that $c' = \tilde{g}_3 c$ gives another
dependency relation for every choice of $g_3$. By the irreducibility of $\rho_{(d)}$, and varying
$g_3$, this can happen only if all $p_\alpha$ vanish identically, which they do not.

Now let $W$ be the span of $\{p_\alpha\}$. Since the $p_\alpha$ are independent, (5.6) defines
a linear map on $W$, making $W$ an irreducible $G(\mathbb{C})$-module. The character of the
corresponding representation of $G(\mathbb{C})$ is the product $\{n^n\} \times \{n^n\} \times (\{2^n\} \otimes \{d\})$,
with $d = n(n-2)$. Application of the Littlewood-Richardson rule [23] shows that, as
a character of $GL(n,\mathbb{C})$ where partitions of depth greater than $n$ are excluded, the
third factor is $\{2^n\} \otimes \{d\} = \{2 + n(n-2), 2^{n-1}\}$.

Thus the polynomial $f_3(P;\mathbf{x})$ is associated to a module of polynomials in
the entries of $P$ alone that transforms as $\{n^n\} \times \{n^n\} \times \{2 + n(n-2), 2^{n-1}\}$. The
dimension of such an irreducible module is calculated by the hook length formula as

$$\dim((n^n)) \times \dim((n^n)) \times \dim((2 + n(n-2), 2^{n-1})) = 1 \times 1 \times \binom{n(n-2) + (n-1)}{n(n-2)} = \binom{n^2 - n - 1}{n - 1}.$$

This result is not a surprise, since it is the dimension of the space of homogeneous
polynomials in $n$ variables of degree $n(n-2)$, i.e., the cardinality of the basis $\{p_\alpha\}$
of $W$.

The multiplicity of modules transforming as $\{n^n\} \times \{n^n\} \times \{2 + n(n-2), 2^{n-1}\}$
in the decomposition of $\mathbb{C}[U]_{n^2}$ can be calculated, at least for a few small values of $n$,
using Schur, and the results are given in Table 5.2. Note that in the notation for the multiplicity

$$\gamma^{(n^2)}_{(n^n)(n^n)(2+n(n-2),\,2^{n-1})},$$

whose values are given in the table, the superscript $(n^2)$ should be read as a 1-part
partition of the integer $n^2$, while the subscripts $(n^n)$ denote the $n$-part partition
$(n, n, \ldots, n)$ of $n^2$ in the standard shorthand notation for partitions.



n    Module                                   Multiplicity    Dimension
2    {2^2} x {2^2} x {2^2}                         1               1
3    {3^3} x {3^3} x {5, 2^2}                      2              10
4    {4^4} x {4^4} x {10, 2^3}                     5             165
5    {5^5} x {5^5} x {17, 2^4}                    10            3876

Table 5.2. Irreducible modules associated with the functions $f_i$ on $n \times n \times n$ tensors, along with their
multiplicities $\gamma^{(n^2)}_{(n^n)(n^n)(2+n(n-2),\,2^{n-1})}$ and dimensions $\binom{n^2-n-1}{n-1}$ in the space of homogeneous
polynomials of degree $n^2$ in the entries of $P$.
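
The dimension column of Table 5.2 can be reproduced independently. The sketch below is an illustration under stated assumptions, not the computation used for the table. It evaluates the hook content formula $\prod_{(i,j) \in \lambda} (n + j - i)/\mathrm{hook}(i,j)$, which is the standard way of counting semi-standard tableaux of shape $\lambda$ with entries in $\{1, \ldots, n\}$, and compares the result with $\binom{n^2-n-1}{n-1}$. The function names `gl_dimension` and `third_factor` are hypothetical.

```python
from fractions import Fraction
from math import comb

def gl_dimension(shape, n):
    """Dimension of the irreducible GL(n) module labelled by the partition shape,
    computed with the hook content formula."""
    shape = [part for part in shape if part > 0]
    if len(shape) > n:
        return 0  # partitions of depth greater than n give no GL(n) module
    if not shape:
        return 1  # empty partition: the trivial representation
    # conj[j] = number of rows of length greater than j (column lengths)
    conj = [sum(1 for part in shape if part > j) for j in range(shape[0])]
    dim = Fraction(1)
    for i, part in enumerate(shape):
        for j in range(part):
            hook = (part - j) + (conj[j] - i) - 1
            dim *= Fraction(n + j - i, hook)
    return int(dim)

def third_factor(n):
    # the partition (2 + n(n-2), 2^(n-1)) appearing in the third tensor factor
    return [2 + n * (n - 2)] + [2] * (n - 1)

for n in range(2, 6):
    print(n, gl_dimension([n] * n, n), gl_dimension(third_factor(n), n))
```

The first printed dimension is that of the one-dimensional factor $\{n^n\}$, and the second should agree with the Dimension column of the table.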

For $n = 3$, the multiplicity of 2 shown in Table 5.2 allows not only for the existence
of $f_i$, but also for an additional function transforming in the same way. Indeed,
from the transformation formula (4.5) for $h_i$ and the fact that $\tau_3$ is an invariant of
weight $(2,2,2)$, the function $G_i(P;\mathbf{x}) = h_i(P;\mathbf{x})\, \tau_3(P)$ will be such a function. This
is precisely the construction given in Proposition 4.4, and since $\tau_3(D) \neq 0$, that result
applies. By the discussion preceding that proposition, $f_i/h_i$ is not independent of $\mathbf{x}$,
and thus cannot be a multiple of $\tau_3$. Thus $f_i$ and $G_i$ are independent.

Similarly, for $n = 4$, $G_i = h_i^2 \tau_4$ and $f_i$ transform in the same way, and can be seen
to be independent. While Table 5.2 indicates the existence of 3 other independent
functions with the same transformation property, we have no explicit understanding
of them, and do not know what, if anything, they indicate about tensor rank.

6. Application to real tensors. In this section we investigate real tensors in
$\mathcal{D}(\mathbb{C})$. As Corollary 4.2 gives a semialgebraic description of $\mathcal{D}(\mathbb{C})$, it is natural to seek
a similar description of $\mathcal{D}(\mathbb{R})$. To obtain conditions that define $\mathcal{D}(\mathbb{R})$ we should of
course include the additional condition that a tensor $P$ be real. However, even in the
case that $n = 2$ this is not sufficient to define $\mathcal{D}(\mathbb{R})$; one also needs that $\Delta(P) > 0$ [9].

Using our functions $f_i$ as a tool, our main results are as follows. For $n = 3$ the
sign of an invariant function can be used to distinguish $\mathcal{D}(\mathbb{R})$, extending the $n = 2$
result in this single case. For all $n \ge 2$, the zero set of our $f_i$ partitions the real points
in $\mathcal{D}(\mathbb{C})$ into connected components. Within a component, all tensors have the same
number of complex conjugate pairs of rank-1 tensors in their rank decompositions.
Moreover, with one exception, there is a single component for each allowable number
of pairs. In particular one component of the real points is $\mathcal{D}(\mathbb{R})$.

Let $V_n(\mathbb{R}) = V_n \cap \mathbb{R}^{n \times n \times n}$ denote the real points on $V_n$.

Lemma 6.1. Suppose $P \in \mathcal{D}(\mathbb{C}) \cap V_n(\mathbb{R})$. Then, up to simultaneous permutation
of the rows of the $g_i$, $P$ can be uniquely expressed as $P = D(g_1, g_2, g_3)$, subject to the
following conditions:

(i) The first non-zero entry of every row of $g_1$ and $g_2$ is 1,
(ii) For some $k \le n/2$, the first $2k$ rows of every $g_i$ are complex (and neither
real nor purely imaginary), in conjugate pairs, and the remaining rows are real.
Thus $P$ has a unique decomposition into complex rank-1 components with $2k$ complex
components, in conjugate pairs, and $n - 2k$ real components.

Proof. For some $g_1, g_2, g_3 \in GL(n,\mathbb{C})$, $P = D(g_1, g_2, g_3)$, which implies

$$P = \sum_{i=1}^{n} g_1^i \otimes g_2^i \otimes g_3^i, \tag{6.1}$$

where $g_j^i$ is the $i$th row of $g_j$. Since the rows of each $g_j$ are independent, Kruskal's
Theorem [18, 19, 26] implies this rank-1 decomposition is unique, up to ordering of
the summands. However the individual vectors $g_j^i$ can be multiplied by scalars $a_j^i$, as
long as $a_1^i a_2^i a_3^i = 1$. Requiring the first non-zero entries in each row of $g_1, g_2$ to be 1
removes that freedom.

Since $P$ is real, the complex conjugate of the decomposition in equation (6.1)
must give the same decomposition, up to order. Thus for each $i$ either $g_1^i \otimes g_2^i \otimes g_3^i$
is real, or its complex conjugate also appears as a summand.

We may thus simultaneously permute the rows of the $g_j$ so for $i = 1, \ldots, k$

$$g_1^{2i-1} \otimes g_2^{2i-1} \otimes g_3^{2i-1} = \overline{g_1^{2i} \otimes g_2^{2i} \otimes g_3^{2i}},$$

and for $2k < i \le n$ the summand is real.

Having done this, since each $g_1^i, g_2^i$ has an entry of 1, from the conjugate
summands we first see that $g_3^{2i-1} = \overline{g_3^{2i}}$ for $i = 1, \ldots, k$ and then that similar statements
hold for the rows of $g_1$ and $g_2$. Thus all three $g_i$ have the first $2k$ rows in conjugate
pairs. Moreover, none of these first $2k$ rows of any $g_i$ is real, lest there be a repeated
row, contradicting that $g_i \in GL(n,\mathbb{C})$. Likewise, these rows are not purely imaginary.

That the remaining rows of the $g_i$ are real follows by an analogous argument. □

For $P \in \mathcal{D}(\mathbb{C}) \cap V_n(\mathbb{R})$, we refer to the ordered pair $(n - 2k, k)$ of this lemma as
the signature of $P$. For $P \notin \mathcal{D}(\mathbb{C}) \cap V_n(\mathbb{R})$, the rank-1 tensor decomposition may not
be unique, so we leave the signature undefined. The orbit $\mathcal{D}(\mathbb{R})$ thus comprises those
$P \in \mathcal{D}(\mathbb{C})$ with signature $(n, 0)$.

Information on the signature of a tensor can be obtained from the value of the
function $\tau_i$ defined in section 4, as we now show.

Theorem 6.2. For all $n$, the set of tensors $P \in V_n(\mathbb{R})$ for which $\tau_i(P) > 0$ is
precisely those tensors of signature $(n - 2k, k)$ with $k$ even.

In particular, for $n = 2$, $\mathcal{D}(\mathbb{R})$ is precisely the set of tensors $P \in V_2(\mathbb{R})$ with
$\Delta(P) > 0$, and for $n = 3$, $\mathcal{D}(\mathbb{R})$ is precisely the set of tensors $P \in V_3(\mathbb{R})$ with
$\tau_3(P) > 0$.

For any $n \ge 3$, $\mathcal{D}(\mathbb{R})$ is precisely the set of tensors $P \in V_n(\mathbb{R})$ with $f_i(P;\mathbf{x}) \neq 0$
such that when $f_i(P;\mathbf{x})$ is factored into linear forms as

$$f_i(P;\mathbf{x}) = c \prod_{j=1}^{n} l_j(\mathbf{x})^{n-2},$$

then all of the linear forms $l_j$ may be taken to be real.

In this statement the conditions $P \in V_n(\mathbb{R})$ with $f_i(P;\mathbf{x}) \neq 0$ can be replaced,
by Theorem 4.1 and Corollary 4.2, by an explicit collection of polynomial equalities
and $f_i(P;\mathbf{x}) \neq 0$.

Proof. If $P$ has signature $(n - 2k, k)$, then $P = D(g_1, g_2, g_3)$ where each $g_i$ has
exactly $k$ pairs of complex conjugate rows. Thus $\overline{g_i} = \sigma g_i$ for some permutation
$\sigma$ with $\det \sigma = (-1)^k$. This implies $\overline{\det g_i} = (-1)^k \det g_i$, so $\det g_i$ is real or purely
imaginary according to whether $k$ is even or odd. Thus $(\det g_i)^2$ is positive or negative
according to whether $k$ is even or odd. Using the properties of $\tau_i$ given in formulae
(4.10) and (4.9) the first claim is established.

Note that for $n = 2, 3$, the only allowable even value for $k$ is zero. The claim
for $n = 2$ is then immediate since $\tau_i = \Delta$ in that case. For $n = 3$ the claim follows
similarly, from the fact that $\tau_3$ is of weight $(2,2,2)$ and $\tau_3(D) > 0$, so, by equation
(4.12), $\tau_3$ is a positive multiple of $\tau_i$ when restricted to $\mathcal{D}(\mathbb{C})$.

For arbitrary $n \ge 3$, $P \in V_n$ and $f_i(P;\mathbf{x}) \neq 0$ is equivalent to $P \in \mathcal{D}(\mathbb{C})$ by
Corollary 4.2. By equations (4.6) and (4.8), we have that for $P = D(g_1, g_2, g_3)$,

$$f_i(P;\mathbf{x}) = c' f_i(D; g_i\mathbf{x}) = c \prod_{j=1}^{n} l_j(\mathbf{x})^{n-2}$$

for some scalars $c', c$, and linear forms $l_j$ defined by the rows of $g_i$. If $P \in \mathcal{D}(\mathbb{R})$, so
the $g_i$ may be taken to be real, $f_i(P;\mathbf{x})$ has a factorization using real linear forms.
For $n \ge 3$ and $P \notin \mathcal{D}(\mathbb{R})$, by Lemma 6.1 such a factorization exists with at least two
$l_j$ complex and not rescalable to be real. By unique factorization in the ring $\mathbb{C}[\mathbf{x}]$,
there can be no factorization into powers of real linear forms in this case. □
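
The determinant argument in the first step of the proof is easy to see numerically. The following minimal sketch uses hypothetical example matrices and helper names (`det`, `with_conjugate_pairs`). It builds matrices whose first $2k$ rows come in complex conjugate pairs and whose remaining rows are real, and prints the square of the determinant, which the argument above says is a real number that is positive for even $k$ and negative for odd $k$.

```python
from itertools import permutations

def det(m):
    """Leibniz-formula determinant; adequate for the small matrices used here."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        sign = 1
        for a in range(n):
            for b in range(a + 1, n):
                if perm[a] > perm[b]:
                    sign = -sign
        term = sign
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total

def with_conjugate_pairs(complex_rows, real_rows):
    """Stack each complex row with its conjugate, then append the real rows."""
    rows = []
    for r in complex_rows:
        rows.append(list(r))
        rows.append([z.conjugate() for z in r])
    rows.extend(list(r) for r in real_rows)
    return rows

# hypothetical examples: (n, k) = (3, 1), (4, 2) and (2, 0)
g_k1 = with_conjugate_pairs([(1 + 2j, 3 - 1j, 2)], [(1, 0, 1)])
g_k2 = with_conjugate_pairs([(1, 1j, 0, 0), (0, 0, 1, 1j)], [])
g_k0 = with_conjugate_pairs([], [(1, 2), (3, 4)])

for g in (g_k1, g_k2, g_k0):
    d = det(g)
    print(d * d)
```

The entries are small integers, so the floating-point complex arithmetic here is exact.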

Note that the statement in this theorem about the factorization into linear forms
of $f_i(P;\mathbf{x})$ could be replaced with a similar one about $h_i(P;\mathbf{x})$.

We next consider the connected components obtained from $V_n(\mathbb{R})$ by removing
the zero set of an $f_i$. Let $Z_n = \{P \in V_n(\mathbb{R}) \mid f_i(P;\mathbf{x}) = 0\}$, and note that $Z_n$ is
independent of the choice of $i \in \{1, 2, 3\}$, by Corollary 4.2.

Theorem 6.3. On each path component of $V_n(\mathbb{R}) \setminus Z_n$ the signature is constant.
For each $0 \le k < n/2$, there is 1 path component of signature $(n - 2k, k)$. When $n$ is
even there are 4 components with signature $(0, n/2)$.

Proof. Suppose a component of $V_n(\mathbb{R}) \setminus Z_n$ contains tensors of signature $(n - 2k, k)$
for two different values of $k$. Let $k(P)$ denote the second term in the signature
$(n - 2k, k)$ of a tensor $P$. Then in this component we can choose a tensor $P_0$ and
sequence of tensors $P_1, P_2, P_3, \ldots$ with $\lim_{\ell \to \infty} P_\ell = P_0$, $k(P_0) = k_0 \neq k_1 = k(P_\ell)$ for
$\ell \ge 1$. But then

$$\lim_{\ell \to \infty} f_i(P_\ell;\mathbf{x}) = f_i(P_0;\mathbf{x}). \tag{6.2}$$

By Theorem 6.2, for each $\ell > 0$ the function $f_i(P_\ell;\mathbf{x})$ factors as

$$f_i(P_\ell;\mathbf{x}) = c_\ell \prod_{j=1}^{n} l_{\ell,j}(\mathbf{x})^{n-2}, \tag{6.3}$$

where we assume the linear forms in this factorization have been normalized so that

$$l_{\ell,j}(\mathbf{x}) = \mathbf{u}_{\ell,j} \cdot \mathbf{x},$$

with $\|\mathbf{u}_{\ell,j}\| = 1$ for all $j$. We further assume complex vectors $\mathbf{u}_{\ell,2j-1} = \overline{\mathbf{u}_{\ell,2j}}$ for
$1 \le j \le k_1$ are associated to the non-real linear forms, and real vectors $\mathbf{u}_{\ell,j}$,
$2k_1 < j \le n$, to the real ones.

By compactness of the unit sphere in $\mathbb{C}^n$, passing to a subsequence of $\{P_\ell\}$, we
may assume that for each $j$

$$\lim_{\ell \to \infty} \mathbf{u}_{\ell,j} = \mathbf{u}_j$$

for some unit vector $\mathbf{u}_j$. Let $l_j(\mathbf{x}) = \mathbf{u}_j \cdot \mathbf{x}$. Now by equations (6.2) and (6.3) we see

$$f_i(P_0;\mathbf{x}) = \Bigl( \lim_{\ell \to \infty} c_\ell \Bigr) \prod_{j=1}^{n} l_j(\mathbf{x})^{n-2}.$$

Since for $n - 2k_1$ values of $j$ the $\mathbf{u}_{\ell,j}$ are real, at least this many of the $\mathbf{u}_j$ are. Thus,
since $k_0 \neq k_1$, we have $k_0 < k_1$.

However, $k_0 < k_1$ implies that for some $1 \le j \le k_1$, $\mathbf{u}_{2j} = \lim_{\ell \to \infty} \mathbf{u}_{\ell,2j}$ is real.
But since $\mathbf{u}_{\ell,2j-1} = \overline{\mathbf{u}_{\ell,2j}}$, this means $\mathbf{u}_{2j} = \mathbf{u}_{2j-1}$. That is impossible, as these
vectors are, up to scaling, rows of some $g_i$ where $P_0 = D(g_1, g_2, g_3)$, and thus must
be independent.

Thus the signature is constant on each component.

The number of path components that exist for any fixed value of $k$, $0 \le k \le n/2$,
is always at least 1, since one can construct a real tensor of signature $(n - 2k, k)$. We
now turn to giving an upper bound on the number of such components.

Note first that a $2 \times n$ matrix with complex conjugate rows can be expressed as

$$\begin{pmatrix} 1 & i \\ 1 & -i \end{pmatrix} \begin{pmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \end{pmatrix}$$

for row vectors $\mathbf{r}_1, \mathbf{r}_2 \in \mathbb{R}^n$. Thus if $J_k$ is an $n \times n$ block diagonal matrix with $k$
blocks of $\begin{pmatrix} 1 & i \\ 1 & -i \end{pmatrix}$ and $n - 2k$ singleton blocks of 1, then tensors in $V_n(\mathbb{R}) \setminus Z_n$ with
signature $(n - 2k, k)$ form the $G(\mathbb{R})$ orbit of $D(J_k, J_k, J_k)$. We thus seek to bound the
number of path components of this orbit.

Recall that $GL_n(\mathbb{R})$ has two path components, $GL^+(\mathbb{R})$ and $GL^-(\mathbb{R})$, with
membership according to the sign of the determinant. Then $G(\mathbb{R})$ has 8 path components,
one of which is

$$G^+(\mathbb{R}) = GL^+(\mathbb{R}) \times GL^+(\mathbb{R}) \times GL^+(\mathbb{R}).$$

The trivial bound on the number of components of the $G(\mathbb{R})$ orbit of $D(J_k, J_k, J_k)$ is
thus 8.

Suppose $k \le (n - 1)/2$, so tensors of signature $(n - 2k, k)$ have at least one real
rank-1 component, and $J_k$ has at least one $1 \times 1$ diagonal block, which we assume is
the last. Let $K = \operatorname{diag}(1, \ldots, 1, -1)$, and observe that since $J_k K = K J_k$,

$$D(J_k, J_k, J_k)(K, I, I) = D(K, I, I)(J_k, J_k, J_k)
 = D(I, K, I)(J_k, J_k, J_k)
 = D(J_k, J_k, J_k)(I, K, I),$$

and similarly $D(J_k, J_k, J_k)(I, K, I) = D(J_k, J_k, J_k)(I, I, K)$. Since $\det(K) = -1$, this
implies that the $G(\mathbb{R})$ orbit of $D(J_k, J_k, J_k)$ is the union of the $G^+(\mathbb{R})$ orbits of
$D(J_k, J_k, J_k)$ and $D(J_k, J_k, J_k)(K, I, I)$. To show there is only one path component,
we now need only show $D(J_k, J_k, J_k)$ and $D(J_k, J_k, J_k)(K, I, I)$ are in the same
component.

If $k < (n - 1)/2$, then $J_k$ has at least two $1 \times 1$ diagonal blocks in the last positions.
Let

$$R_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Then one checks that, for the $2 \times 2 \times 2$ diagonal tensor $D_2$,

$$D_2(\sigma_2, R_2, R_2) = D_2.$$

Letting $R$ be the $n \times n$ block diagonal matrix $R = \operatorname{diag}(1, 1, \ldots, 1, R_2)$, and $\sigma =
\operatorname{diag}(1, 1, \ldots, 1, \sigma_2)$, it follows that

$$D(J_k, J_k, J_k)(\sigma, R, R) = D(J_k, J_k, J_k).$$

Since $\det(R) = 1$ and $\det(\sigma) = -1 = \det(K)$, $D(J_k, J_k, J_k)(\sigma, R, R)$ is in the $G^+(\mathbb{R})$
orbit of, and hence the path component of, $D(J_k, J_k, J_k)(K, I, I)$. Thus $D(J_k, J_k, J_k)$
and $D(J_k, J_k, J_k)(K, I, I)$ are in the same component, and there is only one path
component of signature $(n - 2k, k)$.
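
The identities used in this step can be checked directly on the $2 \times 2 \times 2$ diagonal tensor. The sketch below is illustrative only. It assumes the action convention suggested by equation (6.1), namely that $D(g_1, g_2, g_3)$ is the sum over $i$ of the tensor products of the $i$th rows of $g_1, g_2, g_3$, and it takes the swap matrix and a rotation by a quarter turn as the $2 \times 2$ matrices (denoted `sigma2` and `R2` in the code). The names `diagonal_tensor` and `act` are hypothetical.

```python
from itertools import product

def diagonal_tensor(n):
    # the tensor D with entries 1 at positions (i, i, i) and 0 elsewhere
    return [[[1 if i == j == k else 0 for k in range(n)]
             for j in range(n)] for i in range(n)]

def act(P, g1, g2, g3):
    """P(g1, g2, g3): contract the three indices of P with the rows of g1, g2, g3,
    so that D(g1, g2, g3) = sum_i row_i(g1) (x) row_i(g2) (x) row_i(g3)."""
    n = len(P)
    Q = [[[0] * n for _ in range(n)] for _ in range(n)]
    for i, j, k in product(range(n), repeat=3):
        if P[i][j][k] == 0:
            continue
        for a, b, c in product(range(n), repeat=3):
            Q[a][b][c] += P[i][j][k] * g1[i][a] * g2[j][b] * g3[k][c]
    return Q

I2 = [[1, 0], [0, 1]]
K2 = [[1, 0], [0, -1]]      # 2 x 2 analogue of K = diag(1, ..., 1, -1)
sigma2 = [[0, 1], [1, 0]]   # swap, determinant -1
R2 = [[0, 1], [-1, 0]]      # quarter-turn rotation, determinant +1
D2 = diagonal_tensor(2)

print(act(D2, sigma2, R2, R2) == D2)
print(act(D2, K2, I2, I2) == act(D2, I2, K2, I2) == act(D2, I2, I2, K2))
```

The second check mirrors the observation that acting by $K$ in any one of the three factors of the diagonal tensor gives the same result.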

In the case when $n$ is odd and $k = (n - 1)/2$, $J_k$ has a single $1 \times 1$ block in the
last position. Observe that

$$\begin{pmatrix} 1 & i \\ 1 & -i \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \sigma_2 \begin{pmatrix} 1 & i \\ 1 & -i \end{pmatrix},$$

so if $L = \operatorname{diag}(1, 1, \ldots, 1, -1, 1)$, then

$$J_k L = \operatorname{diag}(1, 1, \ldots, 1, \sigma_2, 1)\, J_k.$$

Thus

$$D(J_k, J_k, J_k)(L, L, L) = D(J_k, J_k, J_k).$$

Now $D(J_k, J_k, J_k)(LK, LK, LK)$ is in the $G^+(\mathbb{R})$ orbit of, and thus path component
of, $D(J_k, J_k, J_k)$, but

$$D(J_k, J_k, J_k)(LK, LK, LK) = D(J_k, J_k, J_k)(L, L, L)(K, K, K)
 = D(J_k, J_k, J_k)(K, K, K)
 = D(J_k, J_k, J_k)(K, I, I).$$

Thus there is only one path component of signature $(1, k)$.

If $n$ is even and $k = n/2$, $J_k$ has only $2 \times 2$ blocks on its diagonal. Then $J_k K = \sigma J_k$
where $\sigma = \operatorname{diag}(1, 1, \ldots, 1, \sigma_2)$. Thus

$$D(J_k, J_k, J_k)(K, K, I) = D(\sigma, \sigma, I)(J_k, J_k, J_k)
 = D(I, I, \sigma)(J_k, J_k, J_k)
 = D(J_k, J_k, J_k)(I, I, K),$$

with similar formulas for the action of $(K, I, K)$ and $(I, K, K)$ on $D(J_k, J_k, J_k)$.
One then sees the $G(\mathbb{R})$ orbit of $D(J_k, J_k, J_k)$ is the union of the $G^+(\mathbb{R})$ orbits of
$D(J_k, J_k, J_k)$, $D(J_k K, J_k, J_k)$, $D(J_k, J_k K, J_k)$, and $D(J_k, J_k, J_k K)$. To show there
are 4 path components of signature $(0, k)$ it remains to show these four tensors lie in
different components.

To this end, consider a point $P_0$ of signature $(0, k)$, so

$$P_0 = D(J_k, J_k, J_k)(g_1, g_2, g_3),$$

with $(g_1, g_2, g_3) \in G(\mathbb{R})$. Since $f_i(P_0;\mathbf{x}) \neq 0$, we also have $h_i(P_0;\mathbf{x}) \neq 0$. But by
equation (4.5),

$$h_3(P_0;\mathbf{x}) = \det(g_1) \det(g_2) \det(J_k)^2\, h_3(D(I, I, J_k); g_3\mathbf{x}),$$

while a direct calculation shows

$$h_3(D(I, I, J_k);\mathbf{x}) = \prod_{i=1}^{k} \bigl( x_{2i-1}^2 + x_{2i}^2 \bigr).$$

Thus $h_3(P_0;\mathbf{x})$ is a non-zero polynomial in $\mathbf{x}$ whose values are either non-negative on
all of $\mathbb{R}^n$, or non-positive, with similar statements valid for $h_1, h_2$. But a
straightforward continuity argument shows that along a path composed of points $P$ with
signature $(0, k)$ the polynomial $h_i(P;\mathbf{x})$, viewed as a function of $\mathbf{x}$, cannot pass
between being non-negative valued and non-positive valued without being identically
zero for some $P$. Since it is not identically zero at any point of signature $(0, k)$, on
each path component it must be either non-negative valued for all $P$, or non-positive
valued for all $P$.
But since

$$h_3(D(J_k, J_k, J_k);\mathbf{x}) = h_3(D(J_k, J_k, J_k)(I, I, K);\mathbf{x})
 = -h_3(D(J_k, J_k, J_k)(K, I, I);\mathbf{x}) = -h_3(D(J_k, J_k, J_k)(I, K, I);\mathbf{x})$$

and

$$h_1(D(J_k, J_k, J_k);\mathbf{x}) = h_1(D(J_k, J_k, J_k)(K, I, I);\mathbf{x})
 = -h_1(D(J_k, J_k, J_k)(I, K, I);\mathbf{x}) = -h_1(D(J_k, J_k, J_k)(I, I, K);\mathbf{x}),$$

we can conclude that the 4 points all lie in different path components, and so there
are 4 path components of points of signature $(0, k)$. □

Remark. Even in the well-studied case n = 2, the assertion of Theorem 6.3 seems to
be a new result.

7. Application to a Stochastic Model. We consider here a discrete statistical 
model with a single hidden variable, in order to obtain a semialgebraic description of 
its set of probability distributions.

As a graphical model, it is specified by a 3-leaf tree, as shown in Figure 7.1.
The internal node of the tree, and each leaf, represent random variables, all with
n states. The internal node variable is hidden (i.e., unobservable). The observed
leaf variables are independent when conditioned on the state of the hidden one. The
hidden variable thus provides an 'explanation' of dependencies between the observed
ones. This simple conditional independence model with a hidden variable is common
in many statistical applications, and variously called a hidden naive Bayes model, a
latent class model, or a 3-leaf tree model, though often the four variables are allowed
to have state spaces of different sizes.

Fig. 7.1. A graphical model, in which the 3 leaves represent observed random variables,
$X_1, X_2, X_3$, and the central node a hidden random variable, $X_h$. As considered here, all variables
are assumed to have n states. The structure of the graph indicates the leaf variables are independent
when conditioned on the hidden one.

In applications, the observed variables in this model might represent three
characteristics (such as the results, + or −, of medical tests) measured on individuals in a
population, while the hidden variable represents a 'latent class' to which the
individual belongs (such as whether the individual has or does not have a certain disease).
The probabilities of the test outcomes depend on the disease condition, yet given an
individual's disease state, the results of the tests are independent of each other.

Given this model for fixed n, one can view a probability distribution arising from it
as an $n \times n \times n$ tensor. A natural question is to find a semialgebraic characterization
of such tensors, that is, a collection of polynomial equalities and inequalities that
precisely cut out the distributions arising from the model. The structure of the
probability model ensures that the rank of the tensor is at most n, and moreover
that each summand in a decomposition as a sum of n rank-1 tensors has non-negative
entries. Understanding polynomial equalities holding on these distributions amounts
to understanding the defining ideal of the variety $V_n$, work on which was reviewed
in §3. Inequalities holding on such tensors are much more poorly understood. While
the existence of inequalities in the tensor entries that ensure it arises from meaningful
stochastic parameters (e.g., non-negative) follows from general theory of real algebraic
geometry, explicit inequalities have previously not been given for arbitrary n.

Our interest in the model is motivated by its appearance as the general Markov 
model in phylogenetics, where in the special case n = 4 it is used to model evolution of 
DNA sequences by base substitution. One might think of the unobserved variable as 
representing the base (A, C, T, or G) at a site in a sequence of an ancestral organism 
from which we have no data, and the observed variables as the state of the site in 
three currently extant descendants of it. Though trees with more than 3 leaves are of 
course essential for phylogenetic applications, in a related work [3] it is shown how to
extend a semialgebraic description for the 3-leaf tree to m-leaf trees. 

Denoting this model by $\mathcal{M}_n$, we first describe its parameterization. With $[n] =
\{1,2,\dots,n\}$ as the state space of all random variables, let $\pi = (\pi_1,\dots,\pi_n)$ denote
the probability distribution vector for the hidden variable $X_h$, so $\pi_i = P(X_h = i)$. For
the observed variables $X_i$, $i = 1,2,3$, an $n \times n$ stochastic matrix $M_i$ has $(j,k)$-entry
specifying the conditional probability $P(X_i = k \mid X_h = j)$, so each row of each $M_i$
sums to 1. The connection of this model to the tensor rank questions we study in this
paper arises from the observation that a probability distribution for $\mathcal{M}_n$ is specified
by the $n \times n \times n$ tensor

$$P = \operatorname{Diag}(\pi)(M_1,M_2,M_3).$$

The domain we consider for the parameterization map is specified by requiring

1. $\pi$ has strictly positive entries summing to 1,

2. the $M_i$ have non-negative entries and row sums of 1, and

3. the $M_i$ are non-singular.

For some statistical applications it is also natural to strengthen the second requirement
so that the $M_i$ have strictly positive entries; we thus comment on this situation as
well in Proposition 7.1 below.

Note that a few trivial inequalities in the entries of $P$ that must hold are obvious:
Since $P \in \mathcal{M}_n$ is a probability distribution, its entries must be non-negative. If one
additionally assumes the $M_i$ have strictly positive entries, then $P$ must have strictly
positive entries as well.
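
As a small illustration of the parameterization, the following Python sketch builds the tensor entrywise, assuming the convention $P(i,j,k) = \sum_r \pi_r M_1(r,i) M_2(r,j) M_3(r,k)$, and checks the trivial constraints just noted. The function name `model_tensor` and the numerical parameters are hypothetical choices made only for this example.

```python
from fractions import Fraction as F

def model_tensor(pi, M1, M2, M3):
    """Entries P[i][j][k] = sum_r pi[r] * M1[r][i] * M2[r][j] * M3[r][k]."""
    n = len(pi)
    return [[[sum(pi[r] * M1[r][i] * M2[r][j] * M3[r][k] for r in range(n))
              for k in range(n)] for j in range(n)] for i in range(n)]

# Illustrative parameters for n = 2 (assumed values, not from the text):
# pi is strictly positive; each M_i is non-singular with rows summing to 1.
pi = [F(1, 3), F(2, 3)]
M1 = [[F(3, 4), F(1, 4)], [F(1, 5), F(4, 5)]]
M2 = [[F(2, 3), F(1, 3)], [F(1, 2), F(1, 2)]]
M3 = [[F(9, 10), F(1, 10)], [F(2, 5), F(3, 5)]]

P = model_tensor(pi, M1, M2, M3)
entries = [x for matrix in P for row in matrix for x in row]
total = sum(entries)                      # should be exactly 1
nonnegative = all(x >= 0 for x in entries)
```

Exact rational arithmetic is used so that the sum-to-one check is not affected by rounding.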

For n = 2, a complete semialgebraic description of $\mathcal{M}_2$ has been given in two
recent independent works [33, 17], using different approaches. In particular, in [33]
the $2 \times 2 \times 2$ hyperdeterminant $\Delta$ plays a key role, though many statistically-motivated
ideas are also used. Here we give a semialgebraic description of $\mathcal{M}_n$ for all $n \geq 2$,
using the invariants developed in earlier sections that generalize $\Delta$.

Recall a principal minor of a matrix is the determinant of a submatrix chosen
with the same row and column indices. A leading principal minor is one for which
these indices are $\{1,2,3,\dots,k\}$ for some $k$.

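Proposition 7.1. An $n \times n \times n$ tensor $P$ is in the image of the parametrization
map for $\mathcal{M}_n$ if, and only if, the following conditions hold:

1. $P$ is real, with non-negative entries summing to 1.

2. For some (and hence all) $i$, $P$ satisfies the commutation relations given by
equation (3.1), and the polynomial $f_i(P;x)$ is not identically zero.

3. $\det(P *_i \mathbf{1}) \neq 0$ for all $i \in \{1,2,3\}$.

4. For at least one (and hence all) of the following matrices, all leading principal
minors are strictly positive:

$$\begin{aligned}
&\det(P *_1 \mathbf{1})\,(P *_2 \mathbf{1})\operatorname{adj}(P *_1 \mathbf{1})(P *_3 \mathbf{1})^T, \\
&\det(P *_2 \mathbf{1})\,(P *_1 \mathbf{1})\operatorname{adj}(P *_2 \mathbf{1})(P *_3 \mathbf{1}), \\
&\det(P *_3 \mathbf{1})\,(P *_1 \mathbf{1})^T\operatorname{adj}(P *_3 \mathbf{1})(P *_2 \mathbf{1}).
\end{aligned} \tag{7.1}$$

5. For all $1 \leq l \leq n$, all principal minors of the three matrices

$$\begin{aligned}
&\det(P *_1 \mathbf{1})\,(P *_2 \mathbf{1})\operatorname{adj}(P *_1 \mathbf{1})(P *_3 e_l)^T, \\
&\det(P *_1 \mathbf{1})\,(P *_2 e_l)\operatorname{adj}(P *_1 \mathbf{1})(P *_3 \mathbf{1})^T, \\
&\det(P *_2 \mathbf{1})\,(P *_1 e_l)\operatorname{adj}(P *_2 \mathbf{1})(P *_3 \mathbf{1})
\end{aligned} \tag{7.2}$$

are non-negative.
Here $\operatorname{adj}(M)$ denotes the classical adjoint of a matrix $M$.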

If parameters of the model are restricted so that entries of the $M_i$ are strictly positive,
then in condition (5) one should replace 'principal minors' by 'leading principal
minors' and 'non-negative' by 'positive'.

Note that the only equality constraints in the theorem are those in conditions (1)
and (2). In particular, a full set of generators of the ideal $I(V_n)$ is not used (when
n > 3) in this semialgebraic description of the model.

Our proof will use repeatedly the following well-known classical result on matrices 
defining quadratic forms. 

Theorem 7.2 (Sylvester's Theorem). Let $A$ be an $n \times n$ real symmetric matrix
and $Q(v) = v^T A v$ the associated quadratic form on $\mathbb{R}^n$. Then

1. Q is positive definite if and only if all leading principal minors of A are 
strictly positive. 

2. Q is positive semidefinite if and only if all principal minors of A are non- 
negative. 
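
The two criteria can be sketched in Python as follows. This is only an illustration for small symmetric integer or rational matrices; the helper names are hypothetical, and the determinant is computed by a naive Laplace expansion rather than any method suggested by the text.

```python
from itertools import combinations

def det(A):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not A:
        return 1
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def principal_minor(A, idx):
    """Determinant of the submatrix with row and column indices idx."""
    return det([[A[i][j] for j in idx] for i in idx])

def is_positive_definite(A):
    # Part 1: all leading principal minors strictly positive.
    return all(principal_minor(A, range(k)) > 0 for k in range(1, len(A) + 1))

def is_positive_semidefinite(A):
    # Part 2: all principal minors (not only the leading ones) non-negative.
    n = len(A)
    return all(principal_minor(A, idx) >= 0
               for k in range(1, n + 1) for idx in combinations(range(n), k))

A = [[2, -1], [-1, 2]]   # positive definite
B = [[1, 1], [1, 1]]     # positive semidefinite, not definite
C = [[0, 0], [0, -1]]    # leading principal minors are 0, 0, yet not semidefinite
```

The matrix `C` shows why the semidefinite criterion needs all principal minors: its leading principal minors are non-negative, but the principal minor on index 2 alone is negative.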

Proof (of Proposition 7.1).

We first discuss the necessity of these conditions. The necessity of (1) is clear.
Condition (2) holds by Theorem 4.1, since

$$P = \operatorname{Diag}(\pi)(M_1,M_2,M_3) = D(\operatorname{diag}(\pi)M_1, M_2, M_3)$$

shows $P$ is in the $G(\mathbb{C})$-orbit of $D$.

For (3), observe that $P *_i \mathbf{1}$ is the marginalization of the distribution to two
observed variables, so one sees

$$P *_i \mathbf{1} = M_j^T \operatorname{diag}(\pi) M_k$$

for distinct $i,j,k$ (with $j < k$). Since $\pi$ has positive entries, and $M_j, M_k$ are non-singular, the
determinant of this matrix is non-zero.

Condition (4) can be restated, after dividing the first formula of (7.1) by the
positive number $\det(P *_1 \mathbf{1})^2$, as asserting the positivity of leading principal minors
of

$$(P *_2 \mathbf{1})(P *_1 \mathbf{1})^{-1}(P *_3 \mathbf{1})^T,$$

and two similar expressions. But expressed in terms of parameters, this is

$$(M_1^T \operatorname{diag}(\pi) M_3)(M_2^T \operatorname{diag}(\pi) M_3)^{-1}(M_1^T \operatorname{diag}(\pi) M_2)^T = M_1^T \operatorname{diag}(\pi) M_1.$$

This symmetric matrix, and similar ones obtained from the other expressions, define
positive definite quadratic forms because $\pi$ has positive entries. Thus by Sylvester's
Theorem, all their principal minors are positive.
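
The simplification above can be checked on a concrete example. The Python sketch below does so for n = 2 with exact rational arithmetic, forming the first matrix of (7.1) from the marginalizations of $P$ and comparing it with $\det(P *_1 \mathbf{1})^2 M_1^T \operatorname{diag}(\pi) M_1$. The parameter values and helper names are assumptions made for illustration, and the 2 × 2 adjugate formula is hard-coded.

```python
from fractions import Fraction as F

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def adj2(A):
    """Classical adjoint of a 2 x 2 matrix."""
    return [[A[1][1], -A[0][1]], [-A[1][0], A[0][0]]]

def scale(c, A):
    return [[c * x for x in row] for row in A]

n = 2
pi = [F(1, 3), F(2, 3)]
M1 = [[F(3, 4), F(1, 4)], [F(1, 5), F(4, 5)]]
M2 = [[F(2, 3), F(1, 3)], [F(1, 2), F(1, 2)]]
M3 = [[F(9, 10), F(1, 10)], [F(2, 5), F(3, 5)]]
P = [[[sum(pi[r] * M1[r][i] * M2[r][j] * M3[r][k] for r in range(n))
       for k in range(n)] for j in range(n)] for i in range(n)]

# Marginalizations P *_1 1, P *_2 1, P *_3 1 (sum over the first, second, third index).
A1 = [[sum(P[i][j][k] for i in range(n)) for k in range(n)] for j in range(n)]
A2 = [[sum(P[i][j][k] for j in range(n)) for k in range(n)] for i in range(n)]
A3 = [[sum(P[i][j][k] for k in range(n)) for j in range(n)] for i in range(n)]

# First matrix of (7.1), and its claimed form in terms of the parameters.
lhs = scale(det2(A1), matmul(matmul(A2, adj2(A1)), transpose(A3)))
diag_pi = [[pi[0], 0], [0, pi[1]]]
rhs = scale(det2(A1) ** 2, matmul(matmul(transpose(M1), diag_pi), M1))

leading_minors_positive = lhs[0][0] > 0 and det2(lhs) > 0
```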

A similar argument shows the necessity of condition (5). For instance, letting $\tilde{\pi}$
be the vector whose entries are the products $\pi_i M_3(i,l)$, one sees

$$\det(P *_1 \mathbf{1})\,(P *_2 \mathbf{1})\operatorname{adj}(P *_1 \mathbf{1})(P *_3 e_l)^T = \det(P *_1 \mathbf{1})^2\, M_1^T \operatorname{diag}(\tilde{\pi}) M_1.$$

Since the entries of $M_3$ are non-negative, this matrix defines a positive semidefinite
quadratic form, and thus by Sylvester's Theorem has non-negative principal minors.
If the entries of the $M_i$ are positive, then this matrix defines a positive definite form,
and thus has positive leading principal minors.

Turning to sufficiency, assume conditions (1)-(5) are met by a tensor $P$. By Theorem 4.1,
condition (2) implies $P = D(g_1,g_2,g_3)$ for some $g_i \in GL(n,\mathbb{C})$. Moreover, by
the realness of $P$ in condition (1), from Lemma 6.1 we also know any complex entries
in the $g_i$ occur in complex conjugate rows. Our goal is to modify this expression, so
the $g_i$ are replaced by stochastic matrices, and $D$ by a diagonal tensor with positive
entries.

Letting $s_i = g_i \mathbf{1}$ be the vector of row sums of $g_i$, we have

$$\begin{aligned}
P *_1 \mathbf{1} &= g_2^T \operatorname{diag}(s_1) g_3, \\
P *_2 \mathbf{1} &= g_1^T \operatorname{diag}(s_2) g_3, \\
P *_3 \mathbf{1} &= g_1^T \operatorname{diag}(s_3) g_2.
\end{aligned}$$

Thus the non-vanishing of the determinants of these matrices by condition (3) tells
us the row sums are all non-zero. Letting $M_i = \operatorname{diag}(s_i)^{-1} g_i$, and $\pi$ be the vector of
entry-wise products of $s_1, s_2, s_3$, we thus have $P = \operatorname{Diag}(\pi)(M_1,M_2,M_3)$. Here each
$M_i$ has unit row sums, $\pi$ has non-zero entries, and

$$\begin{aligned}
P *_1 \mathbf{1} &= M_2^T \operatorname{diag}(\pi) M_3, \\
P *_2 \mathbf{1} &= M_1^T \operatorname{diag}(\pi) M_3, \\
P *_3 \mathbf{1} &= M_1^T \operatorname{diag}(\pi) M_2.
\end{aligned}$$

Since $P$ is real, these expressions are as well, though we have not yet shown that
$M_i$ and $\pi$ have real entries. Nonetheless, all $M_i$ have the same number of conjugate
(non-real) pairs of rows, in corresponding positions, with the corresponding entries of
$\pi$ also conjugate (though possibly real).
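
The row-sum normalization just described can be sketched in Python. The example assumes $D$ is the diagonal tensor with all diagonal entries 1, so that $D(g_1,g_2,g_3)$ has entries $\sum_r g_1(r,i) g_2(r,j) g_3(r,k)$; the matrices `g1`, `g2`, `g3` and the function names are hypothetical. It illustrates only this rescaling step, not the later positivity arguments (the resulting matrices here may have negative entries).

```python
from fractions import Fraction as F

def act(d, g1, g2, g3):
    """Tensor with entries sum_r d[r] * g1[r][i] * g2[r][j] * g3[r][k]."""
    n = len(d)
    return [[[sum(d[r] * g1[r][i] * g2[r][j] * g3[r][k] for r in range(n))
              for k in range(n)] for j in range(n)] for i in range(n)]

def normalize(g):
    """Return the row sums s of g and the rescaled matrix diag(s)^(-1) g."""
    s = [sum(row) for row in g]
    M = [[x / s_r for x in row] for row, s_r in zip(g, s)]
    return s, M

# Illustrative non-singular matrices with non-zero row sums.
g1 = [[F(1), F(2)], [F(3), F(-1)]]
g2 = [[F(2), F(0)], [F(1), F(1)]]
g3 = [[F(1), F(1)], [F(0), F(4)]]

P = act([F(1), F(1)], g1, g2, g3)            # P = D(g1, g2, g3)

s1, N1 = normalize(g1)
s2, N2 = normalize(g2)
s3, N3 = normalize(g3)
pi = [a * b * c for a, b, c in zip(s1, s2, s3)]  # entry-wise product of row sums

Q = act(pi, N1, N2, N3)                      # Diag(pi)(N1, N2, N3)
unit_row_sums = all(sum(row) == 1 for N in (N1, N2, N3) for row in N)
```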

Now substituting the above expressions for marginalizations in the three expressions
in condition (4), they simplify to

$$\begin{aligned}
&\det(P *_1 \mathbf{1})^2\, M_1^T \operatorname{diag}(\pi) M_1, \\
&\det(P *_2 \mathbf{1})^2\, M_2^T \operatorname{diag}(\pi) M_2, \\
&\det(P *_3 \mathbf{1})^2\, M_3^T \operatorname{diag}(\pi) M_3.
\end{aligned}$$

This shows that $M_i^T \operatorname{diag}(\pi) M_i$ is real for each $i$. We now argue that if $M_i$ is not
real, then the quadratic form $Q_i$ associated to $M_i^T \operatorname{diag}(\pi) M_i$ is not positive definite.
To that end, suppose two rows (say, the first two) of $M_i$ are complex conjugates, and
thus by Lemma 6.1 of the form

$$m_1^i = r_1 + \mathrm{i}\, r_2, \qquad m_2^i = r_1 - \mathrm{i}\, r_2, \qquad r_j \in \mathbb{R}^n \setminus \{0\},$$

and the corresponding entries of $\pi$ are $\pi_1$, $\pi_2 = \bar{\pi}_1$. Then for any real vector $v$
orthogonal to the real and imaginary parts of the other rows of $M_i$, evaluating the
quadratic form at $v$ yields

$$Q_i(v) = \pi_1 (m_1^i \cdot v)^2 + \bar{\pi}_1 (m_2^i \cdot v)^2.$$

If we additionally choose $v$ to be orthogonal to $r_2$, but not to $r_1$, then

$$Q_i(v) = \pi_1 (r_1 \cdot v)^2 + \bar{\pi}_1 (r_1 \cdot v)^2 = 2\,\Re(\pi_1)(r_1 \cdot v)^2.$$

Positive definiteness of $Q_i$ would thus imply $\Re(\pi_1) > 0$. However, if we instead choose
$v$ to be orthogonal to $r_1$, but not to $r_2$, then

$$Q_i(v) = \pi_1 (\mathrm{i}\, r_2 \cdot v)^2 + \bar{\pi}_1 (\mathrm{i}\, r_2 \cdot v)^2 = -2\,\Re(\pi_1)(r_2 \cdot v)^2,$$

so positive definiteness would imply $\Re(\pi_1) < 0$. Thus if $M_i$ were not real, then $Q_i$
would not be positive definite.

But if $Q_i$ is not positive definite, by Sylvester's Theorem, the positivity of leading
principal minors asserted in condition (4) must be violated. Thus condition (4) implies
at least one of the $M_i$ is real, so all are by Lemma 6.1. Applying Sylvester's Theorem
again to the positive definite form $Q_i$ then implies that $\pi$ has real positive entries.

Finally, condition (5) ensures the entries of the $M_i$ are non-negative. For instance

$$\det(P *_1 \mathbf{1})\,(P *_2 \mathbf{1})\operatorname{adj}(P *_1 \mathbf{1})(P *_3 e_j)^T = \det(P *_1 \mathbf{1})^2\, M_1^T \operatorname{diag}(\pi)\operatorname{diag}(m_3^j) M_1, \tag{7.3}$$

where $m_3^j$ is the $j$th column of $M_3$. Thus all principal minors of this matrix being
non-negative implies the associated quadratic form is positive semidefinite and thus
that $m_3^j$ has non-negative entries. To instead ensure these entries are strictly positive,
we require that the quadratic form be positive definite, and thus that all leading
principal minors be positive. □

For further work in this direction, we direct the reader to [4].

8. Application to 'rank jumping'. It is well known that the limit of a sequence
of tensors of a fixed rank r > 1 may have rank strictly larger than r (that is, tensor
rank, unlike matrix rank, is not upper semicontinuous). This 'rank jumping' is
responsible for the fact that a given tensor may not have a best approximation by a
tensor of fixed lower rank, and can thus be of concern in applied settings.

A tensor that is the limit of tensors of rank r, but not of smaller rank, is said to 
have border rank r. Thus the border rank of a tensor is always less than or equal to 
its rank. 

For instance, while the tensors of complex rank 2 are dense among the $2 \times 2 \times 2$
tensors, there is a unique $G(\mathbb{C})$-orbit of rank 3 tensors [5], which therefore have border
rank 2. An orbit representative, called the Werner tensor in the physics literature, is
usually taken as

$$e_2 \otimes e_1 \otimes e_1 + e_1 \otimes e_2 \otimes e_1 + e_1 \otimes e_1 \otimes e_2.$$

For our purposes, it is more convenient to apply a permutation in the second index,
so that its 3-slices become

$$W = \left[ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \right],$$

and thus have the form described in Proposition 3.3. One may express $W$ as an
explicit limit of rank 2 tensors using a difference quotient [3]. This difference quotient
construction generalizes to other formats, to produce simple examples of tensors whose
rank is larger than their border rank.

Proposition 3.3 suggests a different way of obtaining $W$, and many other tensors
whose rank exceeds their border rank. Our goal in this section is to provide some
explicit examples.

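In the n = 3 case, consider the tensor given by 3-slices as

$$K_3 = \left[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \right]$$

and its perturbation

$$K_{3,\varepsilon} = \left[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \begin{pmatrix} 0 & 1 & 0 \\ 0 & \varepsilon & 1 \\ 0 & 0 & 2\varepsilon \end{pmatrix}, \begin{pmatrix} 0 & \varepsilon & 1 \\ 0 & \varepsilon^2 & 3\varepsilon \\ 0 & 0 & 4\varepsilon^2 \end{pmatrix} \right].$$

Both tensors are 3-slice non-singular and have commuting slices, since for each the
third slice is the square of the second.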

For arbitrary n, one can similarly construct $K_n$ with slices whose entries are
all zeros except for successive super-diagonals of 1s. Perturbing the diagonal of the
second slice by adding $(0, \varepsilon, 2\varepsilon, 3\varepsilon, \dots, (n-1)\varepsilon)$, the other slices can be perturbed to
be appropriate powers of the perturbed second slice, so that one obtains $K_{n,\varepsilon}$ with
all 3-slices commuting. The matrix $Z_{n,\varepsilon}$ of diagonals of the slices of $K_{n,\varepsilon}$ is then a
Vandermonde matrix, and hence non-singular for $\varepsilon \neq 0$.
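
The construction can be sketched in Python as follows, taking the slices to be the powers $N_\varepsilon^0, N_\varepsilon^1, \dots, N_\varepsilon^{n-1}$ of the perturbed second slice $N_\varepsilon$, which is how we read the description above. The function names are hypothetical, and the determinant is a naive Laplace expansion suitable only for small n.

```python
from fractions import Fraction as F

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def det(A):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not A:
        return 1
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def perturbed_slices(n, eps):
    """Slices N^0, ..., N^(n-1), where N = shift + diag(0, eps, ..., (n-1) eps)."""
    eps = F(eps)
    N = [[F(1) if j == i + 1 else (i * eps if j == i else F(0))
          for j in range(n)] for i in range(n)]
    slices = [[[F(int(i == j)) for j in range(n)] for i in range(n)]]
    for _ in range(n - 1):
        slices.append(matmul(slices[-1], N))
    return slices

def diagonal_matrix(slices):
    """Matrix whose k-th row is the diagonal of the k-th slice."""
    return [[S[i][i] for i in range(len(S))] for S in slices]

K3 = perturbed_slices(3, 0)            # unperturbed tensor (eps = 0)
K3_eps = perturbed_slices(3, F(1, 2))  # perturbed tensor with eps = 1/2

commute = all(matmul(A, B) == matmul(B, A) for A in K3_eps for B in K3_eps)
z_det = det(diagonal_matrix(K3_eps))   # Vandermonde determinant, 2 * eps**3
z_det_unperturbed = det(diagonal_matrix(K3))
```

For n = 3 the Vandermonde determinant is $2\varepsilon^3$, which vanishes exactly when $\varepsilon = 0$.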

Now $K_{n,\varepsilon}$ meets the hypotheses of Proposition 3.3 and has slices already
upper-triangularized. Moreover, since $Z_{n,\varepsilon}$ is non-singular for $\varepsilon \neq 0$, it follows that
$K_{n,\varepsilon} \in \mathcal{D}(\mathbb{C})$.

Since $K_n$ is in the closure of all $K_{n,\varepsilon}$, we see $K_n \in V_n$ has border rank at most n.
Since the multilinear rank of $K_n$ is $(n,n,n)$, it cannot have border rank less than n,
so its border rank is exactly n. Since $f_3(K_n;x) \equiv 0$, Theorem 4.1 shows $K_n \notin \mathcal{D}(\mathbb{C})$.
Since this orbit is precisely the tensors of rank n and multilinear rank $(n,n,n)$, this
implies the tensor rank of $K_n$ must be strictly greater than n.

We now determine the rank precisely. 

Proposition 8.1. For any n > 0, $K_n$ has border rank $n$ and rank $2n - 1$, over $\mathbb{C}$.

Proof. The fact that $K_n$ has border rank $n$ has been discussed.

To show the rank is at most $2n - 1$ we give an explicit representation, suggested
by Anders Jensen. We work with a more symmetric tensor $K'_n$, obtained by acting
on $K_n$ by a permutation in the second index, reversing the order of the columns of
each slice. For example,

$$K'_3 = \left[ \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \right].$$

In general $K'_n$ will be the $n \times n \times n$ tensor of all zeros, except for 1's in the $(i,j,k)$
position when $i + j + k = n + 2$.

Let $\zeta$ denote a primitive $(2n-1)$th root of unity. Let $v_l = (\zeta^l, \zeta^{2l}, \dots, \zeta^{nl})$. Then
we claim that

$$K'_n = \frac{1}{2n-1} \sum_{l=1}^{2n-1} \zeta^{l(n-3)}\, v_l \otimes v_l \otimes v_l,$$

and thus $K'_n$ has rank at most $2n - 1$. Indeed, the $(i,j,k)$ entry of this sum is

$$\frac{1}{2n-1} \sum_{l=1}^{2n-1} \zeta^{l(i+j+k+n-3)} = \begin{cases} 1 & \text{if } (2n-1) \mid (i+j+k+n-3), \\ 0 & \text{otherwise.} \end{cases}$$

Since $n \leq i+j+k+n-3 \leq 4n-3$, the non-zero entries occur only when
$i+j+k+n-3 = 2n-1$, i.e., when $i+j+k = n+2$.
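
As a numerical sanity check of this root-of-unity representation, the following Python sketch compares the stated sum with the tensor having 1's exactly where $i + j + k = n + 2$. It assumes the weighted form $\frac{1}{2n-1}\sum_l \zeta^{l(n-3)} v_l \otimes v_l \otimes v_l$, which is what the entry formula above requires; the function names are hypothetical and floating-point arithmetic is used, so agreement is only up to rounding.

```python
import cmath

def k_prime(n):
    """Tensor with 1 in position (i, j, k), 1-indexed, when i + j + k == n + 2."""
    return [[[1 if i + j + k == n + 2 else 0 for k in range(1, n + 1)]
             for j in range(1, n + 1)] for i in range(1, n + 1)]

def root_of_unity_sum(n):
    """(1/(2n-1)) * sum over l of zeta^(l(n-3)) * (v_l tensor v_l tensor v_l)."""
    m = 2 * n - 1

    def zeta_pow(e):
        # e-th power of the primitive m-th root of unity exp(2*pi*i/m)
        return cmath.exp(2j * cmath.pi * e / m)

    return [[[sum(zeta_pow(l * (n - 3)) * zeta_pow(l * i) * zeta_pow(l * j) * zeta_pow(l * k)
                  for l in range(1, m + 1)) / m
              for k in range(1, n + 1)] for j in range(1, n + 1)] for i in range(1, n + 1)]

def max_error(n):
    """Largest entrywise difference between the two tensors."""
    K, S = k_prime(n), root_of_unity_sum(n)
    return max(abs(S[i][j][k] - K[i][j][k])
               for i in range(n) for j in range(n) for k in range(n))
```

Calling `max_error(n)` for small n should return a value at the level of floating-point rounding.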

To see the rank is at least $2n - 1$, suppose $K_n$ could be expressed as

$$K_n = \sum_{l=1}^{2n-2} u_l \otimes v_l \otimes w_l,$$

with $u_l, v_l, w_l \in \mathbb{C}^n$. Since the 12|3 flattening of $K_n$ has rank $n$, the $w_l$ must span
$\mathbb{C}^n$. Thus without loss of generality we may assume $B = \{w_1, w_2, \dots, w_n\}$ is a basis
for $\mathbb{C}^n$. Let $B^* = \{w_1^*, w_2^*, \dots, w_n^*\}$ be the dual basis. Then for any $i \in \{1,2,\dots,n\}$,
$w_i^*$ annihilates at least $n - 1$ of the $w_l$, so $K_n *_3 w_i^*$ is a matrix of rank at most
$(2n-2) - (n-1) = n-1$. But one sees from the explicit form of $K_n$ that $K_n *_3 w^*$
can have rank at most $n - 1$ only if $w^*$ has first coordinate 0. This contradicts that
$B^*$ is a basis. □

One can construct many other examples of 'rank jumping' by considering variants 
of the arguments above using different Jordan block structures of the slices. For 
example, 



$L$ = [3-slices not recoverable from the source]

can be perturbed to

$L_\varepsilon$ = [3-slices not recoverable from the source] $\in \mathcal{D}(\mathbb{C})$.



Here one can see that $L$ has tensor rank at most 4 (by subtracting the third slice from
the first). Since it also has multilinear rank (3,3,3), and by Theorem 4.1 is not in
$\mathcal{D}(\mathbb{C})$, its tensor rank must be exactly 4.

Finally, we note that the maximal $\mathbb{C}$-rank of an $n \times n \times n$ tensor of border rank $n$
for n = 2 is well known to be 3. For n = 3, it is claimed in [10] that the maximal rank
is 5. The tensor $K_n$ achieves these bounds in both cases. We know of no examples of
$n \times n \times n$ tensors of border rank $n$ whose rank exceeds $2n - 1$, the rank of $K_n$. It has
been conjectured by one of us (JAR, see [5]) that no such tensors exist.

It should be noted that the tensors $K_n$ described in this section are similar to
those given in Theorem 5.6 of [1] of size $n \times n \times (\lfloor \log_2 n \rfloor + 1)$, whose rank is $2n - 1$
when $n = 2^k$, and whose border rank has been shown to be $n$ [20]. (When $n \neq 2^k$, the
rank of those tensors is slightly smaller than $2n - 1$.) Related examples are also given
in Corollary 5.7 of [1] of tensors of size $n \times (n+1) \times (n+1)$ and rank approximately
$3n$. However, the border rank of these has only been shown to be bounded above by
approximately $2n$ when $n = 2^k$ [20], with the precise border rank unknown. Thus it
is unclear what the gap between rank and border rank is for these last examples.